Discussion:
A luaL_Buffer question
Marc Balmer
2017-11-02 07:41:27 UTC
Is it possible to create a luaL_Buffer over several calls to Lua functions? E.g. like in the following pseudo code:

d = foo.new() -- Does a luaL_buffinit() in the C code
d:func_a() -- Adds some values to the buffer

-- do a lot of other stuff

d:func_b() -- Adds some more values to the buffer

-- do more stuff

d:flush() -- Does something with the buffer

Or must all buffer operations be finished when I return from C code?
Dirk Laurie
2017-11-02 08:21:22 UTC
Post by Marc Balmer
d = foo.new() -- Does a luaL_buffinit() in the C code
d:func_a() -- Adds some values to the buffer
-- do a lot of other stuff
d:func_b() -- Adds some more values to the buffer
-- do more stuff
d:flush() -- Does something with the buffer
Or must all buffer operations be finished when I return from C code?
The C code must know that Lua depends on the buffer remaining intact.
The standard way of doing that is to create a full userdata whose
storage block is the new luaL_Buffer, with your functions as methods.

The userdata belongs to Lua. As long as there still is a reference to
that object alive somewhere, it should not be collected. You might
need a __gc metamethod which frees the structure to make sure it is
eventually cleaned away.
Viacheslav Usov
2017-11-02 10:54:13 UTC
Post by Marc Balmer
Is it possible to create a luaL_Buffer over several calls to Lua
functions? E.g. like in the following pseudo code:

[...]
Post by Marc Balmer
Or must all buffer operations be finished when I return from C code?
I never used that before, so it was interesting for me to understand how
that works.

I have learnt that (with Lua 5.3) luaL_Buffer has an initial buffer space
as a field. When, however, the user wants to add more data than the initial
buffer can hold, a userdatum pointing to a new buffer space is allocated
and pushed on top of the stack. Subsequent buffer operations assume that
the userdatum is at the top of the stack, without much checking. The
userdatum is removed from the top of the stack only by luaL_pushresult().
As far as I can tell, this means that once C-code starts using a buffer, it
must call luaL_pushresult() before returning to Lua, otherwise the
userdatum that may or may not have been placed on top of the stack becomes
its return value. If the buffer manipulation straddles C/Lua boundary, then
things can get even stranger.

In fact, it is unsafe to do any manipulations of the Lua stack between
luaL_buffinit() and luaL_pushresult(). The following code (with Lua 5.3)
prints 1 and finishes normally:

lua_State *L = luaL_newstate();
luaL_Buffer lb;

lua_pushinteger(L, 1);

luaL_buffinit(L, &lb);
luaL_prepbuffsize(&lb, 5);   /* fits in the initial space lb.initb */
luaL_addsize(&lb, 5);

printf("%d\n", (int)lua_tointeger(L, -1));   /* the integer 1 is still on top */

lua_pushinteger(L, 1);   /* unbalanced push, goes unnoticed here */

luaL_prepbuffsize(&lb, 5);
luaL_addsize(&lb, 5);

The following code (with Lua 5.3) prints 0 and then crashes in the last
line:

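lua_State *L = luaL_newstate();
luaL_Buffer lb;

lua_pushinteger(L, 1);

luaL_buffinit(L, &lb);
luaL_prepbuffsize(&lb, sizeof lb.initb + 1);   /* too big for lb.initb: a userdatum is pushed */
luaL_addsize(&lb, sizeof lb.initb + 1);

printf("%d\n", (int)lua_tointeger(L, -1));   /* top is now the userdatum, so this prints 0 */

lua_pushinteger(L, 1);   /* the userdatum is no longer on top */

luaL_prepbuffsize(&lb, sizeof lb.initb + 1);   /* crashes here */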

The manual does indicate that what the code above does is invalid, but it
uses fairly mild language that can be easily misunderstood as something
benign:

During its normal operation, a string buffer uses a variable number of
stack slots. So, while using a buffer, you cannot assume that you know
where the top of the stack is. You can use the stack between successive
calls to buffer operations as long as that use is balanced; that is, when
you call a buffer operation, the stack is at the same level it was
immediately after the previous buffer operation.

(end)
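
To make the "balanced use" rule concrete, here is a toy model in plain C. It is only a sketch: every name in it (toy_stack, toy_buffer, toy_add and so on) is made up for illustration, and it does not reproduce Lua's actual implementation. It assumes only what is described above: small data stays in an inline area, the first overflow pushes a "box" onto the stack, and later growth trusts that the box is still the top slot. Where the real library would misuse whatever sits on top, the model returns -1.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in Lua. */
#define TOY_INLINE_CAP 8
#define TOY_STACK_MAX 16

enum slot_kind { SLOT_INT, SLOT_BOX };

struct toy_stack {
    enum slot_kind slots[TOY_STACK_MAX];
    int top;                      /* number of used slots */
};

struct toy_buffer {
    struct toy_stack *st;
    size_t cap;                   /* current capacity */
    size_t len;                   /* bytes accounted for so far */
    int on_stack;                 /* 1 once a box has been pushed */
};

void toy_push_int(struct toy_stack *st) {
    assert(st->top < TOY_STACK_MAX);
    st->slots[st->top++] = SLOT_INT;
}

void toy_pop(struct toy_stack *st) {
    assert(st->top > 0);
    st->top--;
}

void toy_buffinit(struct toy_stack *st, struct toy_buffer *b) {
    b->st = st;
    b->cap = TOY_INLINE_CAP;
    b->len = 0;
    b->on_stack = 0;
}

/* Account for n more bytes. Returns 0 on success, or -1 when growth is
   needed and the top slot is not the box (the case that corrupts memory
   in the real library). */
int toy_add(struct toy_buffer *b, size_t n) {
    struct toy_stack *st = b->st;
    if (b->cap - b->len < n) {                 /* not enough room: grow */
        size_t newcap = b->cap * 2;
        if (newcap - b->len < n) newcap = b->len + n;
        if (b->on_stack) {
            /* Growth assumes the box is the top slot. */
            if (st->top == 0 || st->slots[st->top - 1] != SLOT_BOX)
                return -1;
        } else {
            assert(st->top < TOY_STACK_MAX);
            st->slots[st->top++] = SLOT_BOX;   /* first overflow pushes a box */
            b->on_stack = 1;
        }
        b->cap = newcap;
    }
    b->len += n;
    return 0;
}

/* Small data: the stray push is never noticed. */
int toy_demo_small(void) {
    struct toy_stack st = { .top = 0 };
    struct toy_buffer b;
    toy_buffinit(&st, &b);
    if (toy_add(&b, 3) != 0) return -1;
    toy_push_int(&st);                         /* unbalanced push */
    return toy_add(&b, 3);
}

/* Large data: the same stray push now hides the box. */
int toy_demo_large(void) {
    struct toy_stack st = { .top = 0 };
    struct toy_buffer b;
    toy_buffinit(&st, &b);
    if (toy_add(&b, TOY_INLINE_CAP + 1) != 0) return -2;
    toy_push_int(&st);                         /* unbalanced push */
    return toy_add(&b, TOY_INLINE_CAP + 1);
}

/* Large data, balanced use: push and pop between buffer operations. */
int toy_demo_balanced(void) {
    struct toy_stack st = { .top = 0 };
    struct toy_buffer b;
    toy_buffinit(&st, &b);
    if (toy_add(&b, TOY_INLINE_CAP + 1) != 0) return -2;
    toy_push_int(&st);
    toy_pop(&st);                              /* stack level restored */
    return toy_add(&b, TOY_INLINE_CAP + 1);
}
```

In this model toy_demo_small() and toy_demo_balanced() succeed while toy_demo_large() fails, which mirrors the pattern of the two snippets above: the same mistake is invisible with small data and fatal with large data.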

I am unsure why such a dangerous facility is present, but I think that its
dangers should be stressed by stronger language. The problem is that the
crash happens only when large strings are stuffed into the buffer, which is
a perfect ingredient for "it works when I test it, but it crashes randomly
in production, and no one knows why".

I think it can also be made less dangerous by using a reference to the
userdatum rather than the top of the stack, but frankly, I think this is
something that should only be used by Lua itself and its public use
deprecated.

Cheers,
V.
Dirk Laurie
2017-11-02 12:53:19 UTC
Post by Viacheslav Usov
Post by Marc Balmer
Is it possible to create a luaL_Buffer over several calls to Lua
[...]
Post by Marc Balmer
Or must all buffer operations be finished when I return from C code?
I am unsure why such a dangerous facility is present, but I think that its
dangers should be stressed by stronger language. The problem is that the
crash happens only when large strings are stuffed into the buffer, which is
a perfect ingredient for "it works when I test it, but it crashes randomly
in production, and no one knows why".
It is there for greater efficiency.
Post by Viacheslav Usov
I think it can also be made less dangerous by using a reference to the
userdatum rather than the top of the stack, but frankly, I think this is
something that should only be used by Lua itself and its public use
deprecated.
Oh, that is overstating the case. The public (or at least that section
of it that can write code in the Lua C API) is not so delicate.

What I envisaged in my earlier reply is in fact quite easy, almost
trivial. See attachment. (Lua 5.3).
Viacheslav Usov
2017-11-02 15:42:08 UTC
Post by Dirk Laurie
It is there for greater efficiency.
Efficiency of what? Everything that luaL_Buffer and Co does can be done
using the non-auxiliary API. It does not use any magic that is unavailable
to users otherwise.
Post by Dirk Laurie
Oh, that is overstating the case. The public (or at least that section of
it that can write code in the Lua C API) is not so delicate.

In the context of this thread, where the original poster, who is not a Lua
neophyte, wonders whether the use of luaL_Buffer can straddle the C/Lua
interface, this is a very strange statement.
Post by Dirk Laurie
What I envisaged in my earlier reply is in fact quite easy, almost
trivial. See attachment. (Lua 5.3).

And your code is proof enough that you, another experienced Lua user,
misunderstood both the manual and my explanation, because it has exactly
the problem I wrote about and it nicely demonstrates "it works when I test
it, but it crashes randomly in production, and no one knows why".

Here is my test of your code:

lua_State *L = luaL_newstate();

luaL_openlibs(L);
luaopen_buffer(L);
lua_setglobal(L, "b");
luaL_dostring(L,
"local x = b.new()"
"b.append(x, string.rep('x', 100000))"
"b.append(x, string.rep('x', 100000))"
"print(b.flush(x))"
);

When I run the above, it crashes within the second call to luaL_addvalue(),
due to a heap corruption. If I remove the second b.append... line, then it
crashes within luaL_pushresult(), for the same reason.

Exercise for the reader: explain why.

Hint: set a breakpoint in boxgc() in lauxlib.c.

Cheers,
V.
Dirk Laurie
2017-11-03 05:50:54 UTC
Post by Viacheslav Usov
Post by Dirk Laurie
It is there for greater efficiency.
Efficiency of what? Everything that luaL_Buffer and Co does can be done
using the non-auxiliary API. It does not use any magic that is unavailable
to users otherwise.
Post by Dirk Laurie
Oh, that is overstating the case. The public (or at least that section of
it that can write code in the Lua C API) is not so delicate.
In the context of this thread, where the original poster, who is not a Lua
neophyte, wonders whether the use of luaL_Buffer can straddle the C/Lua
interface, this is a very strange statement.
If by "C/Lua interface" you include calling Lua from C, then you are
out of my league. You are in effect writing your own Lua interpreter.
Of course extra precautions to conserve stack integrity are needed, as
the manual states.
Post by Viacheslav Usov
Post by Dirk Laurie
What I envisaged in my earlier reply is in fact quite easy, almost
trivial. See attachment. (Lua 5.3).
And your code is proof enough that you, another experienced Lua user,
misunderstood both the manual and my explanation, because it has exactly the
problem I wrote about and it nicely demonstrates "it works when I test it,
but it crashes randomly in production, and no one knows why".
lua_State *L = luaL_newstate();
luaL_openlibs(L);
luaopen_buffer(L);
lua_setglobal(L, "b");
luaL_dostring(L,
"local x = b.new()"
"b.append(x, string.rep('x', 100000))"
"b.append(x, string.rep('x', 100000))"
"print(b.flush(x))"
);
When I run the above, it crashes within the second call to luaL_addvalue(),
due to a heap corruption. If I remove the second b.append... line, then it
crashes within luaL_pushresult(), for the same reason.
I will be more impressed if you can crash my code by using it as a
module of C routines to be called from Lua.
Dirk Laurie
2017-11-03 06:27:35 UTC
Post by Dirk Laurie
Post by Viacheslav Usov
Post by Dirk Laurie
It is there for greater efficiency.
Efficiency of what? Everything that luaL_Buffer and Co does can be done
using the non-auxiliary API. It does not use any magic that is unavailable
to users otherwise.
Post by Dirk Laurie
Oh, that is overstating the case. The public (or at least that section of
it that can write code in the Lua C API) is not so delicate.
In the context of this thread, where the original poster, who is not a Lua
neophyte, wonders whether the use of luaL_Buffer can straddle the C/Lua
interface, this is a very strange statement.
If by "C/Lua interface" you include calling Lua from C, then you are
out of my league. You are in effect writing your own Lua interpreter.
Of course extra precautions to conserve stack integrity are needed, as
the manual states.
Post by Viacheslav Usov
Post by Dirk Laurie
What I envisaged in my earlier reply is in fact quite easy, almost
trivial. See attachment. (Lua 5.3).
And your code is proof enough that you, another experienced Lua user,
misunderstood both the manual and my explanation, because it has exactly the
problem I wrote about and it nicely demonstrates "it works when I test it,
but it crashes randomly in production, and no one knows why".
lua_State *L = luaL_newstate();
luaL_openlibs(L);
luaopen_buffer(L);
lua_setglobal(L, "b");
luaL_dostring(L,
"local x = b.new()"
"b.append(x, string.rep('x', 100000))"
"b.append(x, string.rep('x', 100000))"
"print(b.flush(x))"
);
When I run the above, it crashes within the second call to luaL_addvalue(),
due to a heap corruption. If I remove the second b.append... line, then it
crashes within luaL_pushresult(), for the same reason.
I will be more impressed if you can crash my code by using it as a
module of C routines to be called from Lua.
OK, I can spare you the effort. If I run your code line by line in the
interpreter (without 'local'), it is fine. If I put it inside do ... end,
it crashes spectacularly. You are right.

The reason it is in the API must be that it is needed to code the
standard library, which prides itself on being totally written in the
API.
Sean Conner
2017-11-03 06:55:19 UTC
Post by Dirk Laurie
Post by Dirk Laurie
Post by Viacheslav Usov
lua_State *L = luaL_newstate();
luaL_openlibs(L);
luaopen_buffer(L);
lua_setglobal(L, "b");
luaL_dostring(L,
"local x = b.new()"
"b.append(x, string.rep('x', 100000))"
"b.append(x, string.rep('x', 100000))"
"print(b.flush(x))"
);
When I run the above, it crashes within the second call to luaL_addvalue(),
due to a heap corruption. If I remove the second b.append... line, then it
crashes within luaL_pushresult(), for the same reason.
I will be more impressed if you can crash my code by using it as a
module of C routines to be called from Lua.
OK, I can spare you the effort. If I run your code line by line in the
interpreter (without 'local'), it is fine. If I put it inside do ... end,
it crashes spectacularly. You are right.
The reason it is in the API must be that it is needed to code the
standard library, which prides itself on being totally written in the
API.
I got the following when I compiled your code and ran it:

[spc]lucy:/tmp/foo>lua-53
Lua 5.3.4 Copyright (C) 1994-2017 Lua.org, PUC-Rio
> b = require "buffer"
> x = b.new()
> b.append(x, string.rep('x', 100000))
> b.append(x, string.rep('x', 100000))
*** glibc detected *** double free or corruption (!prev): 0x098a1ed0 ***
Aborted (core dumped)
[spc]lucy:/tmp/foo>

When I modified the code thusly:

static int buffer_new (lua_State *L) {
  luaL_Buffer **buf = lua_newuserdata(L,sizeof(luaL_Buffer *));
  *buf = malloc(sizeof(luaL_Buffer));
  luaL_buffinit(L,*buf);
  luaL_setmetatable(L,"Buffer");
  return 1;
}

static int buffer_flush (lua_State *L) {
  luaL_Buffer **buf = luaL_checkudata(L,1,"Buffer");
  luaL_pushresult(*buf);
  free(*buf);
  *buf = NULL;
  return 1;
}

static int buffer_append (lua_State *L) {
  luaL_addvalue(*(luaL_Buffer **)luaL_checkudata(L,1,"Buffer"));
  return 0;
}

I got (after the first call to b.append()):
xxxxx
*** glibc detected *** malloc(): memory corruption: 0x08e98da8 ***
Aborted (core dumped)

Meanwhile, this works:

#include <stdio.h>

#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

static int bar(lua_State *L)
{
  lua_pushinteger(L,5 * lua_tointeger(L,1) + 2);
  return 1;
}

static int baz(lua_State *L)
{
  luaL_Buffer buf;
  char *p;
  int l;
  int x;

  luaL_buffinit(L,&buf);
  luaL_addstring(&buf,"result: ");
  lua_getglobal(L,"foo");
  lua_call(L,0,1);
  x = lua_tointeger(L,-1);
  p = luaL_prepbuffsize(&buf,10000);
  l = snprintf(p,10000,"%d",x * 2);
  luaL_addsize(&buf,l);
  luaL_pushresult(&buf);
  return 1;
}

int main(void)
{
  lua_State *L = luaL_newstate();
  luaL_openlibs(L);
  lua_pushcfunction(L,bar);
  lua_setglobal(L,"bar");
  luaL_loadstring(L,"function foo() return bar(3) * 4 + 5 end");
  lua_call(L,0,0);
  lua_getglobal(L,"print");
  lua_pushcfunction(L,baz);
  lua_call(L,0,1);
  lua_call(L,1,0);
  lua_close(L);
  return 0;
}

-spc (Which is how I think the luaL_Buffer API is supposed to be used ... )
Viacheslav Usov
2017-11-03 07:28:38 UTC
Post by Dirk Laurie
The reason it is in the API must be that it is needed to code the
standard library, which prides itself on being totally written in the API.

As I said (and as the manual says), all of the auxiliary library API can be
implemented with the non-auxiliary API. luaL_Buffer uses nothing but the
standard API as far as I can tell.

There is nothing wrong with it when it is used as intended (as Sean
demonstrates). The problem is the language explaining the intended use is
very mild, and gives people, such as yourself, wild ideas about that, while
the implementation is unforgiving. That's why I put [BUG] in the subject
line.

Cheers,
V.
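
For the original use case (a buffer that lives across several calls from Lua), one option that sidesteps luaL_Buffer altogether is to keep a plain growable byte array inside the userdata. The following is a minimal sketch of such an array in standard C only. The names strbuf, sb_init, sb_append and sb_free are hypothetical, and the Lua binding is left out: it would presumably place the struct in a full userdata, call sb_free from __gc, and hand the bytes to lua_pushlstring() in flush. Size overflow checks are omitted for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-backed buffer; its storage does not depend on the
   Lua stack, so it can outlive any single C function call. */
struct strbuf {
    char  *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
};

void sb_init(struct strbuf *sb) {
    sb->data = NULL;
    sb->len = 0;
    sb->cap = 0;
}

/* Append n bytes from s. Returns 0 on success, -1 if allocation fails
   (in which case the buffer is left unchanged). */
int sb_append(struct strbuf *sb, const char *s, size_t n) {
    if (n == 0) return 0;
    if (sb->cap - sb->len < n) {
        size_t newcap = sb->cap ? sb->cap * 2 : 64;
        if (newcap - sb->len < n) newcap = sb->len + n;
        char *p = realloc(sb->data, newcap);
        if (p == NULL) return -1;
        sb->data = p;
        sb->cap = newcap;
    }
    memcpy(sb->data + sb->len, s, n);
    sb->len += n;
    return 0;
}

/* Release the storage; safe to call more than once. */
void sb_free(struct strbuf *sb) {
    free(sb->data);
    sb_init(sb);
}
```

The trade-off is an extra copy when the final string is pushed to Lua, in exchange for storage whose lifetime is tied to the userdata rather than to the stack of a single C call.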
