Discussion:
Boxed userdata and garbage collector
Sylvain Fabre
2010-09-02 18:18:43 UTC
Hi all,
I am currently facing an issue with the combination of "boxed userdata" and
the garbage collector in one of our tools:

* The userdata we exchange between Lua and C are full userdata, but
"boxed": we allocate a new userdata the size of a pointer,
and we set the pointer to an internally allocated structure.
* Those internal structures hold pictures, which leads to the
allocation of big memory chunks.
* But from Lua's point of view, we allocate only a pointer (i.e. 4
or 8 bytes typically), so the memory accounted for by the garbage
collector is very low.
* As a consequence, we run out of memory very quickly and we are
obliged to tweak the PAUSE/MUL values of the GC, which is
not completely satisfactory.

I would like to know whether this issue has already been addressed by someone
somewhere.
If not, I would suggest adding (in Lua 5.3 :) ?) a way to
increase/decrease the memory counter used by the garbage collector
from the "external world" (either with a new lua_gc method or with a
lua_newuserboxeddata).
Any suggestions or ideas on this?

Thanks, S. FABRE
Florian Weimer
2010-09-02 19:02:47 UTC
Post by Sylvain Fabre
If not, I would suggest adding (in Lua 5.3 :) ?) a way to
increase/decrease the memory counter used by the garbage collector
from the "external world" (either with a new lua_gc method or with a
lua_newuserboxeddata).
Any suggestions or ideas on this?
Have you tried adding a function which frees the picture memory to
your API? That would mean that you could sustain a higher picture
allocation/deallocation rate independently of the garbage collector
implementation. For such large and comparatively rare objects, the
additional burden on the programmer does not seem that large.

Obviously, it depends a lot on your environment if this is feasible.
Sylvain Fabre
2010-09-02 19:14:52 UTC
This is an option too, but in that case I need to know which objects
are marked as 'free' by the Lua engine. And right now, I do not know if
there is a way to achieve this.

Rgds, Sylvain.
--
=================================
Sylvain FABRE
***@inpixal.com
Fixe: 09 72 11 30 24
Mobile: 06 30 12 72 34
Fax : 09 72 11 10 71
=================================
Jonathan Castello
2010-09-02 19:49:08 UTC
On Thu, Sep 2, 2010 at 12:14 PM, Sylvain Fabre wrote:
This is an option too, but in that case I need to know which objects are
marked as 'free' by the Lua engine. And right now, I do not know if there
is a way to achieve this.
Have you looked into the __gc metamethod that you can use on full
userdata? It's called when the Lua GC is collecting that userdatum,
and it's good for releasing attached memory in C, decrementing
refcounts, closing file descriptors, and so on.

~Jonathan
Sylvain Fabre
2010-09-02 20:03:03 UTC
Yes of course :) All our objects have a __gc metamethod implemented!
But the core issue is that for each object allocated internally, only 4
or 8 bytes are seen by the Lua garbage collector, while internally
megabytes are allocated.
Hence, the garbage collector does not trigger "properly", and we run out
of memory.

Currently, I strongly constrain the GC with the mul factor, but this is
a dirty workaround IMHO.
This is why I suggest adding a way to tell the GC engine the real
size of the memory allocated for a userdata (and not only the size
passed to the lua_newuserdata function).
Jonathan Castello
2010-09-02 20:09:28 UTC
Post by Sylvain Fabre
Yes of course :) All our objects have a __gc metamethod implemented!
But the core issue is that for each object allocated internally, only 4
or 8 bytes are seen by the Lua garbage collector, while internally
megabytes are allocated.
Hence, the garbage collector does not trigger "properly", and we run out of
memory.
What if you allocate the entire space required by your object as a
userdatum, and use placement new to initialize it into the space Lua
allocated for you? Then you're not using a pointer, you're using the
whole object.

~Jonathan
Jonathan Castello
2010-09-02 20:11:41 UTC
Post by Jonathan Castello
Post by Sylvain Fabre
Yes of course :) All our objects have a __gc metamethod implemented!
But the core issue is that for each object allocated internally, only 4
or 8 bytes are seen by the Lua garbage collector, while internally
megabytes are allocated.
Hence, the garbage collector does not trigger "properly", and we run out of
memory.
What if you allocate the entire space required by your object as a
userdatum, and use placement new to initialize it into the space Lua
allocated for you? Then you're not using a pointer, you're using the
whole object.
~Jonathan
Of course, if you're not using C++ you wouldn't use placement new... *cough*

~Jonathan
Sylvain Fabre
2010-09-02 20:20:55 UTC
That's what I was going to answer :) We are using C...
Ted Unangst
2010-09-02 20:27:37 UTC
Post by Jonathan Castello
Post by Sylvain Fabre
Yes of course :) All our objects have a __gc metamethod implemented!
But the core issue is that for each object allocated internally, only 4
or 8 bytes are seen by the Lua garbage collector, while internally
megabytes are allocated.
Hence, the garbage collector does not trigger "properly", and we run out of
memory.
What if you allocate the entire space required by your object as a
userdatum, and use placement new to initialize it into the space Lua
allocated for you? Then you're not using a pointer, you're using the
whole object.
What happens when you resize an image? You need to allocate more
memory. You can't modify all the Lua references to the old image, so
you need the Lua refs to be shallow pointers to the real data. And
allocating the real data as userdata too, and then hanging a reference
to it, gets to be a pain. You could make all operations return new
images, but that's quite inefficient.

I was able to work around the problem with explicit close() methods,
and also ended up not creating many images, but that's not viable for
everyone.

I think many languages offer some sort of "inflate" interface to the
garbage collector to either specify the real size of an object, or at
least indicate overall memory pressure.
Jonathan Castello
2010-09-02 20:30:29 UTC
Post by Ted Unangst
What happens when you resize an image? You need to allocate more
memory. You can't modify all the Lua references to the old image, so
you need the Lua refs to be shallow pointers to the real data. And
allocating the real data as userdata too, and then hanging a reference
to it, gets to be a pain. You could make all operations return new
images, but that's quite inefficient.
Hmm, see, since the OP said that they were using a "structure", I
assumed the userdatum was not the image itself but rather a structure
containing a pointer to an image.

If I was wrong I apologize.

~Jonathan
Ted Unangst
2010-09-02 20:37:52 UTC
Post by Jonathan Castello
What happens when you resize an image?  You need to allocate more
memory.  You can't modify all the Lua references to the old image, so
you need the Lua refs to be shallow pointers to the real data.  And
allocating the real data as userdata too, and then hanging a reference
to it, gets to be a pain.  You could make all operations return new
images, but that's quite inefficient.
Hmm, see, since the OP said that they were using a "structure", I
assumed the userdatum was not the image itself but rather a structure
containing a pointer to an image.
That's exactly the interpretation I had. The idea is that the current
shallow userdatum allows swapping out the heavy data. Moving the
heavy data into the userdatum has a number of drawbacks.
Peter Cawley
2010-09-02 20:33:30 UTC
Post by Ted Unangst
And allocating the real data as userdata too, and then hanging a reference
to it, gets to be a pain.
I don't see why it should be any more painful than normal memory
management in C, using something like the following:

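#include <stddef.h>
#include "lua.h"

/* Allocate sz bytes as a full userdata and anchor it in the registry,
   keyed by its own address, so it stays alive until free_lua(). */
void* malloc_lua(size_t sz, lua_State* L)
{
    void* p = lua_newuserdata(L, sz);    /* stack: ud */
    lua_pushlightuserdata(L, p);         /* stack: ud, key */
    lua_insert(L, -2);                   /* stack: key, ud */
    lua_settable(L, LUA_REGISTRYINDEX);  /* registry[key] = ud */
    return p;
}

/* Drop the registry reference; the block is collected in a later cycle. */
void free_lua(void* p, lua_State* L)
{
    lua_pushlightuserdata(L, p);
    lua_pushnil(L);
    lua_settable(L, LUA_REGISTRYINDEX);  /* registry[key] = nil */
}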
Ted Unangst
2010-09-02 20:40:27 UTC
Post by Peter Cawley
And allocating the real data as userdata too, and then hanging a reference
to it, gets to be a pain.
I don't see why it should be any more painful than normal memory
Unless I have vastly misunderstood your proposal, these user data
would never be garbage collected. Explicit frees in Lua are no more
painful than explicit frees in C, but I'd prefer to let the garbage
collector do its job.
Peter Cawley
2010-09-02 20:47:41 UTC
Post by Ted Unangst
Post by Peter Cawley
And allocating the real data as userdata too, and then hanging a reference
to it, gets to be a pain.
I don't see why it should be any more painful than normal memory
Unless I have vastly misunderstood your proposal, these user data
would never be garbage collected.  Explicit frees in Lua are no more
painful than explicit frees in C, but I'd prefer to let the garbage
collector do its job.
Post by Peter Cawley
void* malloc_lua(size_t sz, lua_State* L)
{
 void* p = lua_newuserdata(L, sz);
 lua_pushlightuserdata(L, p);
 lua_insert(L, -2);
 lua_settable(L, LUA_REGISTRYINDEX);
 return p;
}
void free_lua(void* p, lua_State* L)
{
 lua_pushlightuserdata(L, p);
 lua_pushnil(L);
 lua_settable(L, LUA_REGISTRYINDEX);
}
My proposal is that the current image-related code is taken, and then
the calls to malloc/free which deal with the large buffer are replaced
with calls to some generic allocator. Then the above Lua-based
allocator is used as the generic allocator, meaning that the garbage
collector is aware that there is this large block of memory floating
around. This causes the Lua GC to run collections at better intervals,
which should cause the proper image userdata to be collected. In turn,
the __gc method on the proper userdata will eventually call the
generic free function on its large buffer, which calls the lua
allocator free function, which causes the large buffer to be garbage
collected in the next cycle.
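
The generic-allocator idea above could look roughly like the following sketch. The names ImgAllocator, Image, image_create, image_destroy and the counting backend are hypothetical, chosen for illustration only; they are not part of Lua or of any image library. The backend here is plain malloc with a live-block counter so that the sketch runs without Lua. In a real binding the two callbacks would presumably wrap malloc_lua/free_lua from the earlier message, with the lua_State passed as the context pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pluggable allocator: the image code calls these hooks
   instead of calling malloc/free directly. */
typedef struct ImgAllocator {
    void *(*alloc)(void *ctx, size_t size);
    void  (*release)(void *ctx, void *ptr);
    void  *ctx;   /* e.g. a lua_State* for a Lua-backed allocator */
} ImgAllocator;

/* Hypothetical image type holding one large pixel buffer. */
typedef struct Image {
    size_t         nbytes;
    unsigned char *pixels;
    ImgAllocator  *allocator;
} Image;

/* Stand-in backend for this sketch: malloc plus a live-block counter. */
static void *counting_alloc(void *ctx, size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        ++*(size_t *)ctx;
    return p;
}

static void counting_release(void *ctx, void *ptr)
{
    if (ptr != NULL) {
        --*(size_t *)ctx;
        free(ptr);
    }
}

/* Create an image whose big buffer comes from the pluggable allocator. */
Image *image_create(ImgAllocator *a, size_t width, size_t height, size_t channels)
{
    Image *img;
    assert(a != NULL && a->alloc != NULL && a->release != NULL);
    img = malloc(sizeof *img);
    if (img == NULL)
        return NULL;
    img->nbytes = width * height * channels;
    img->allocator = a;
    img->pixels = a->alloc(a->ctx, img->nbytes);
    if (img->pixels == NULL) {
        free(img);
        return NULL;
    }
    return img;
}

/* Called from the box's __gc (or an explicit close): hand the buffer back
   to the same allocator it came from. */
void image_destroy(Image *img)
{
    if (img == NULL)
        return;
    img->allocator->release(img->allocator->ctx, img->pixels);
    free(img);
}
```

With a Lua-backed allocator plugged in, the large buffer would be a userdata of its real size, so the collector's accounting would include it; only the small Image header would stay outside Lua's view.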
Ted Unangst
2010-09-02 20:58:14 UTC
Post by Peter Cawley
My proposal is that the current image-related code is taken, and then
the calls to malloc/free which deal with the large buffer are replaced
with calls to some generic allocator. Then the above Lua-based
allocator is used as the generic allocator, meaning that the garbage
collector is aware that there is this large block of memory floating
around. This causes the Lua GC to run collections at better intervals,
which should cause the proper image userdata to be collected. In turn,
the __gc method on the proper userdata will eventually call the
generic free function on its large buffer, which calls the lua
allocator free function, which causes the large buffer to be garbage
collected in the next cycle.
Very interesting. Thanks, that's a consequence I hadn't thought of.
Michal Kottman
2010-09-02 21:12:31 UTC
Post by Ted Unangst
Post by Peter Cawley
My proposal is that the current image-related code is taken, and then
the calls to malloc/free which deal with the large buffer are replaced
with calls to some generic allocator. Then the above Lua-based
allocator is used as the generic allocator...
Very interesting. Thanks, that's a consequence I hadn't thought of.
I have a similar problem - I am currently creating a binding for the
OpenCV library. This solution cannot be applied in my case, because I
cannot (and do not want to) modify the library. A common pattern in
OpenCV is this (this is how I imagine the binding):

while not finished do
    local img = cv.RetrieveFrame(cam) -- retrieves image from cam
    -- do some processing
end

A full color image takes about 230K, and imagine 15fps - that is a lot
of data. Yes, I could force the end user to call img:Release() after
processing, but I want to spare him this step.

Since I cannot force the library to use my custom allocator (and the
image is allocated inside the library), I have to use boxed userdata,
containing a pointer to the image - and Lua sees that as only 4/8 bytes,
which may cause memory problems after some time.
Wesley Smith
2010-09-03 01:02:30 UTC
Post by Michal Kottman
while not finished do
local img = cv.RetrieveFrame(cam) -- retrieves image from cam
-- do some processing
end
A full color image takes about 230K, and imagine 15fps - that is a lot
of data. Yes, I could force the end user to call img:Release() after
processing, but I want to spare him this step.
Since I cannot force the library to use my custom allocator (and the
image is allocated inside the library), I have to use boxed userdata,
containing a pointer to the image - and Lua sees that as only 4/8 bytes,
which may cause memory problems after some time.
I've been doing a lot of work with video processing and lua for the past few years and have never had memory problems. Basically you don't want to return a new userdata on each frame, but reuse it and the memory. That way you only have a minimal amount to free.

wes
Sylvain Fabre
2010-09-03 05:51:49 UTC
I am in fact facing the same situation as with OpenCV: I would like
to avoid modifying the C library itself.
Up to now, I think that the easiest solution (even if not 100% perfect)
is Roberto's proposal (i.e. calling lua_gc(STEP, x) during
picture allocation to 'speed up' the GC).
In the end, after reading all your contributions, I think that the
perfect solution remains the integration of 2 new actions in the
lua_gc call, allowing the Lua internal memory counter to be
increased/decreased externally.

Anyway, thanks for all your contributions, they are clearly very helpful.
Henk Boom
2010-09-03 06:12:27 UTC
Post by Sylvain Fabre
In the end, after reading all your contributions, I think that the
perfect solution remains the integration of 2 new actions in the
lua_gc call, allowing the Lua internal memory counter to be
increased/decreased externally.
Maybe the ability to set both the real and 'apparent' size when
allocating a udata would be easier to manage. That way you wouldn't
need to manually decrease the counter on collection.

henk
Roberto Ierusalimschy
2010-09-03 13:39:50 UTC
Post by Sylvain Fabre
I am in fact facing the same situation as with OpenCV: I would
like to avoid modifying the C library itself.
Up to now, I think that the easiest solution (even if not 100%
perfect) is Roberto's proposal (i.e. calling lua_gc(STEP, x)
during picture allocation to 'speed up' the GC).
In the end, after reading all your contributions, I think that
the perfect solution remains the integration of 2 new actions
in the lua_gc call, allowing the Lua internal memory counter to be
increased/decreased externally.
With the current incremental garbage collector, the collector is moved
forward by allocations, that is, deltas in 'totalbytes', not by its
absolute value. (The only effect that the absolute value of 'totalbytes'
has on the collector speed is to make the collector slower, increasing
the pause between collection cycles.)

So, to mimic the effect of an external allocation as if it were
internal, lua_gc(STEP, x) seems to be more than enough. I do not
think there is a need for extra API.
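
A minimal sketch of how a binding could size that step, assuming the Lua 5.2 behaviour described elsewhere in this thread, where one unit of the step argument corresponds to the work done for 1KB of allocation (the 5.1 manual leaves the unit unspecified, so there the value would need tuning). gc_step_for_bytes is a hypothetical helper, not part of the Lua API; only lua_gc and LUA_GCSTEP in the trailing comment are real.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: convert the size of an external (non-Lua)
   allocation into a step argument, one unit per started kilobyte. */
int gc_step_for_bytes(size_t nbytes)
{
    size_t kb = nbytes / 1024 + (nbytes % 1024 != 0);  /* round up */
    if (kb > (size_t)INT_MAX)
        kb = (size_t)INT_MAX;                           /* lua_gc takes an int */
    return (int)kb;
}

/* Intended use right after the library allocates a picture, with the
   real Lua C API (not compiled here, so the sketch stays standalone):

       lua_gc(L, LUA_GCSTEP, gc_step_for_bytes(picture_bytes));
*/
```

For a 320x240 image with 3 bytes per pixel (230400 bytes), the helper yields a step argument of 225.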

-- Roberto
Duncan Cross
2010-09-03 18:10:33 UTC
On Fri, Sep 3, 2010 at 2:39 PM, Roberto Ierusalimschy
Post by Roberto Ierusalimschy
With the current incremental garbage collector, the collector is moved
forward by allocations, that is, deltas in 'totalbytes', not by its
absolute value. (The only effect that the absolute value of 'totalbytes'
has on the collector speed is to make the collector slower, increasing
the pause between collection cycles.)
So, to mimic the effect of an external allocation as if it were
internal, lua_gc(STEP, x) seems to be more than enough.
That's interesting, thank you for the clarification. Maybe it would be
a good idea to add a note about this to the documentation for
lua_newuserdata(), or somewhere in the next version of the manual/PiL.

-Duncan
Florian Weimer
2010-09-04 09:48:59 UTC
Post by Michal Kottman
I have a similar problem - I am currently creating a binding for the
OpenCV library. This solution cannot be applied in my case, because I
cannot (and do not want to) modify the library. A common pattern in
while not finished do
local img = cv.RetrieveFrame(cam) -- retrieves image from cam
-- do some processing
end
Would it be possible to write this instead?

while not finished do
    local img = cv.RetrieveFrame(cam) -- retrieves image from cam
    -- do some processing
    img:release()
end

This doesn't seem to be too onerous to me.

(In exception handlers, you should perform full collections if you
follow this style, but that shouldn't result in performance problems
as long as exceptions are rare.)
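
If an explicit release method is offered alongside a __gc fallback, the release has to be safe to call twice, since the collector will still finalize the box later. A minimal sketch of that idea on the C side, with hypothetical names (ImageBox, image_box_release) and plain free standing in for whatever release function the image library actually provides:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical box: this is what would live inside the full userdata. */
typedef struct ImageBox {
    void *image;   /* library-owned image, NULL once released */
} ImageBox;

/* Release the image if it is still held. Returns 1 if memory was freed,
   0 if the box was already empty, so both an explicit release and the
   __gc metamethod can call it in any order. */
int image_box_release(ImageBox *box)
{
    assert(box != NULL);
    if (box->image == NULL)
        return 0;
    free(box->image);      /* a real binding would call the library's release */
    box->image = NULL;     /* mark as released so a second call is a no-op */
    return 1;
}
```

Methods that use the image would then check for a NULL pointer and raise an error, so that a stale reference to a released image fails loudly rather than touching freed memory.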
Peter Cawley
2010-09-02 20:11:19 UTC
Post by Sylvain Fabre
Yes of course :) All our objects have a __gc metamethod implemented!
But the core issue is that for each object allocated internally, only 4
or 8 bytes are seen by the Lua garbage collector, while internally
megabytes are allocated.
Hence, the garbage collector does not trigger "properly", and we run out of
memory.
The obvious solution to me is to allow the internally used allocator
to be changed, and allocate those megabytes as userdata themselves
(with said userdata not actually being exposed to any Lua script, nor
having any fancy metatable).
Sylvain Fabre
2010-09-02 20:23:09 UTC
Thanks for the idea, I will try it!
The one point I want to check is that lua_newuserdata is not
too slow compared to the usual "malloc-type" allocators.
Alex Queiroz
2010-09-02 20:25:07 UTC
Hallo,
 Thanks for the idea, i will try it !
The point i just want to check is that the use of lua_newuserdata is not too
slow compared to usual "malloc-type" allocators.
You can replace the memory allocator used by Lua too.
--
-alex
http://www.artisancoder.com/
Ted Unangst
2010-09-02 20:30:55 UTC
Post by Peter Cawley
The obvious solution to me is to allow the internally used allocator
to be changed, and allocate those megabytes as userdata themselves
(with said userdata not actually being exposed to any Lua script, nor
having any fancy metatable).
If the userdata is truly not exposed to Lua, it will be collected very
quickly. The effects of this are quite undesirable. :)
Peter Cawley
2010-09-02 20:35:19 UTC
Post by Ted Unangst
Post by Peter Cawley
The obvious solution to me is to allow the internally used allocator
to be changed, and allocate those megabytes as userdata themselves
(with said userdata not actually being exposed to any Lua script, nor
having any fancy metatable).
If the userdata is truly not exposed to Lua, it will be collected very
quickly.  The effects of this are quite undesirable. :)
Well obviously you have to put it somewhere, just not somewhere
visible. The registry or the environment of the enclosing userdata box
are obvious candidates.
Roberto Ierusalimschy
2010-09-02 20:06:38 UTC
Post by Sylvain Fabre
* But, from the LUA point of view, we allocate only a pointer (ie 4
or 8 bytes typically) : so the memory computed by the garbage
collector is very low
Have you tried to call "collectgarbage('step', x)" (for some
reasonable 'x') each time you allocate a new picture?

-- Roberto
Sylvain Fabre
2010-09-02 20:23:41 UTC
Thanks Roberto,
I have not tried this; I will test it tomorrow.

Thanks, Sylvain.
Francesco Abbate
2010-09-03 06:53:27 UTC
Hi Sylvain,

I think that since you want the Lua GC to manage the allocated
memory, you should allocate the memory with Lua. You can do that very
easily in C++ with placement new and an explicit destructor call, or in C
by using lua_newuserdata instead of malloc to allocate the whole
buffer.

I'm using this technique in both C and C++ code and it works like a
charm; it is simpler and it reduces memory fragmentation.

I hope that helps.

Francesco
Mark Hamburg
2010-09-03 17:49:58 UTC
See http://lua-users.org/lists/lua-l/2007-06/msg00088.html

Lightroom uses the call collectgarbage( 'step', x ) for some appropriate
value of x when working with images, but setting x is tricky. So, I
recently patched Lua as referenced above and have added an extraspace
field to our proxies together with an API to adjust the extra space
associated with the proxy. I haven't had things running long enough to
report on how well or even if it works, but it seems promising as a way
to keep Lua better informed of the memory in actual use.

It seems like this could be a good addition to 5.2.

Mark
Javier Guerra Giraldez
2010-09-03 18:03:42 UTC
Post by Mark Hamburg
It seems like this could be a good addition to 5.2.
+1
--
Javier
Roberto Ierusalimschy
2010-09-03 20:29:51 UTC
Post by Mark Hamburg
See http://lua-users.org/lists/lua-l/2007-06/msg00088.html
Lightroom uses the call collectgarbage( 'step', x ) for some appropriate
value of x when working with images, but setting x is tricky. So, I
recently patched Lua as referenced above and have added an extraspace
field to our proxies together with an API to adjust the extra space
associated with the proxy. I haven't had things running long enough to
report on how well or even if it works, but it seems promising as a way
to keep Lua better informed of the memory in actual use.
It seems like this could be a good addition to 5.2.
As I explained in another message, keeping Lua better
informed of the memory in actual use seems to have a negligible impact
on the collector (except for making it run a little slower). Unless
someone gives a good argument that this understanding is wrong, I cannot
see why it would be a good addition to 5.2.

In Lua 5.2, on the other hand, the 'x' in collectgarbage( 'step', x )
will correspond exactly to the work the collector does when allocating
1KB of memory, so it should not be that tricky to set it.

-- Roberto
Mark Hamburg
2010-09-05 17:00:46 UTC
Post by Roberto Ierusalimschy
Post by Mark Hamburg
See http://lua-users.org/lists/lua-l/2007-06/msg00088.html
Lightroom uses the call collectgarbage( 'step', x ) for some appropriate
value of x when working with images, but setting x is tricky. So, I
recently patched Lua as referenced above and have added an extraspace
field to our proxies together with an API to adjust the extra space
associated with the proxy. I haven't had things running long enough to
report on how well or even if it works, but it seems promising as a way
to keep Lua better informed of the memory in actual use.
It seems like this could be a good addition to 5.2.
As I explained in another message, keeping Lua better
informed of the memory in actual use seems to have a negligible impact
on the collector (except for making it run a little slower). Unless
someone gives a good argument that this understanding is wrong, I cannot
see why it would be a good addition to 5.2.
In Lua 5.2, on the other hand, the 'x' in collectgarbage( 'step', x )
will correspond exactly to the work the collector does when allocating
1KB of memory, so it should not be that tricky to set it.
The approach of calling 'step' when allocating the image (with an appropriate definition for 'step') seems to be conceptually equivalent to telling Lua about extra space in use which should drive the GC forward. What's lost is the ability to say, "but I manually released that space, so we can turn the pressure back down".

For example, say I've got code that generates a series of images. Sometimes those images are dispersed through the application, for example as previews, and will live on until no longer referenced and they get collected. (In fact, they may live in weak-table based caches.) At other times, they won't propagate and will get explicitly closed. This could happen when doing something like tracking a slider for an adjustment. Ideally, the code that built the image proxy should also tell the GC about the space in use by the image. But that then means that both use cases are treated identically and there is no way to get credit for the prompt closure of images. I could address this by replacing the close logic with a recycle path, but that both requires deeper plumbing to loop the recycled image back and potentially loses the benefit of having the closed image serve as an error indicator if references to the image live on (or we end up plumbing through the recycle path AND sending back a new image anyway).

Now, it may be that the way totalbytes works with the GC in 5.1 made this not really work as expected either, but on the surface it seems sensible. Memory pressure goes up and the GC should run faster. Memory pressure comes down and it should run more slowly.

Mark
Juris Kalnins
2010-09-06 09:22:40 UTC
Post by Mark Hamburg
Post by Roberto Ierusalimschy
In Lua 5.2, on the other hand, the 'x' in collectgarbage( 'step', x )
will correspond exactly to the work the collector does when allocating
1KB of memory, so it should not be that tricky to set it.
[Assuming it's a typo and "x kB" was meant]
Post by Mark Hamburg
The approach of calling 'step' when allocating the image (with an
appropriate definition for 'step') seems to be conceptually equivalent
to telling Lua about extra space in use which should drive the GC
forward. What's lost is the ability to say, "but I manually released
that space, so we can turn the pressure back down".
Oh, but there's no "pressure". Quite the opposite, the more memory you
have allocated, the more GC is relaxing. (See the definition
of stddebt in lgc.c)

Lua GC measures the amount of work it has to do in abstract
units of work (GCthisCOST and GCthatCOST), and does about
gcstepmul units of work per step. Every kilobyte you allocate
will buy you gcstepmul units of work from the GC.

The cost of sweeping (freeing) data does not depend on its size.

When the GC finishes one cycle, it goes on a vacation, determined
by the gcpause value.

collectgarbage( 'step', x ) will run step x times, but not
beyond the end of the current cycle. If it was near the end, it will
do just a little, and enter the pause.

Nothing in all that logic depends on the size of allocated data,
except for the length of the pause.
Juris Kalnins
2010-09-06 10:11:45 UTC
Post by Juris Kalnins
collectgarbage( 'step', x ) will run step x times, but not
beyond the end of current cycle. If it was near the end, it will
do just a little, and enter pause.
Oh, and another thing. Unless you reach the end of cycle
(collectgarbage returns true), running collectgarbage( 'step', x )
will increase your debt by x kilobytes. So you have not tricked
GC to do any additional work, just made it do it a bit earlier.
