Discussion:
Reentrant GC
Soni L.
2018-11-27 02:52:44 UTC
Permalink
Has anyone made a reentrant Lua GC, to allow garbage collection within __gc
metamethod?

I could really use this, but it doesn't seem to be a thing.
云风 Cloud Wu
2018-11-27 03:10:19 UTC
Permalink
Has anyone made a reentrant Lua GC, to allow garbage collection within __gc metamethod?
I could really use this, but it doesn't seem to be a thing.
I think you can minimize the __gc metamethod: just put the dead
object into a table in __gc, and call the destructors of the objects by
iterating over this table in the main loop.
--
http://blog.codingnow.com
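(For illustration, a minimal Lua sketch of the deferral Cloud Wu describes; all names here are made up, not from any existing library.)

```
-- Minimal sketch of deferred finalization: __gc only records the dead object,
-- and the real destructor runs later in the main loop, where allocating
-- and collecting again are safe.
local pending = {}

local function new_resource(name)
  return setmetatable({name = name}, {
    __gc = function(self)
      pending[#pending + 1] = self   -- just enqueue; do no real work here
    end
  })
end

local function drain_pending()
  while #pending > 0 do
    local res = table.remove(pending)
    print("destroying", res.name)    -- the expensive/allocating cleanup goes here
  end
end

-- main loop
local r = new_resource("socket-1"); r = nil
collectgarbage("collect")            -- __gc enqueues the object
drain_pending()                      -- actual destruction happens here
```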
Coda Highland
2018-11-27 03:51:57 UTC
Permalink
Has anyone made a reentrant Lua GC, to allow garbage collection within __gc metamethod?
I could really use this, but it doesn't seem to be a thing.
That's not the first thing I think of when I hear "reentrant." Lua's
GC *is* reentrant by the definition I usually use, which is to say
that multiple threads can GC distinct Lua states concurrently.

Reentrancy by the notion you suggest sounds pretty tricky. I can
envision how to do it if I were to build a GC from scratch, but given
how much of a niche use case it is I'm not sure if it would be worth
the effort.

What's your use case? Why can't you do something like queue up an
action inside your __gc metamethod that you can process after the
sweep has ended?

/s/ Adam
Philippe Verdy
2018-11-27 04:54:30 UTC
Permalink
That's a good approach, given that the GC naturally runs in a loop of cycles,
which allows an object not to be swept immediately but to be collected again in
the next cycle (after a new collection phase).

I think that what Soni L. wants is to be able to GC recursively within the
same thread, so that a __gc metamethod can call any other function that
could allocate new objects (and so could potentially trigger a GC; however,
this call is blocked by the fact that the thread already has an active
sweeping phase, so any attempt to perform another GC does nothing and the
GC has to wait for the next cycle).

I can imagine cases where such blocking may cause memory to
never be deallocated, because each GC cycle will actually increase the number
of allocated objects without being able to free any of them.

This is not really a lack of "reentrance" but a lack of "recursivity" (which is
already blocked silently as soon as the GC is invoked), and it only causes
problems if your __gc metamethods do very complex things that require
allocating additional objects which themselves need such a __gc
metamethod for their own finalization.

Reentrance, on the opposite, should not be a problem (unless there's some
communication/synchronization mechanism between threads that forces one
thread to wait for the completion of the GC of another thread, in which
case you would get a deadlock where one of the threads has its GC
blocked and then does nothing in its own GC cycle, everything being
delayed to the next cycle).

It's not easy to control that recursivity only during the finalization of
objects (in the __gc metamethods) without the help of a background thread
dedicated to controlling GC in all threads (I note for example that Java,
JavaScript, and other languages perform GC only via a dedicated thread to
serialize things correctly; I see similar things at OS level for
OS-level threads; and there is likewise a dedicated background process in the
OS for managing the global heaps needed and used by all other
processes or system drivers. Maybe this is caused by the complex problem
of recursion: instead of recursing, a thread (or process) is simply blocked by
serializing into the system helper thread or process; this can cause a
process or thread to freeze all other processes or threads temporarily).

On a single-tasking system where there's no multithreading or
multiprocessing, but only cooperative coroutines, the GC becomes blocking
if ever there's a need for recursion (and recursion is not possible from
within the dedicated coroutine), so the application is naturally blocked:
no other coroutine is resumed before the mark/sweep phase is
complete and finalization has been properly applied, unless finalization is
delayed indefinitely into a never-ending loop of GC cycles (which could consume
significant CPU time without the thread ever entering an idle
cycle, except through forced pauses that do nothing).

There are not a lot of programs that use __gc metamethods for finalization.
Basically, most uses are for terminating I/O and freeing resources when I/O
completes (i.e. for emulating async I/O, e.g. for network sockets,
which don't necessarily run in a separate thread but in a coroutine of the
same thread): for such use, there's normally no need to allocate new memory
resources, it's enough to "free" them by removing a reference, which will
then cause the objects to be collected and swept in the next cycle without
needing additional I/O, so the finalization is much simpler and can be
delayed safely without creating infinite loops.

If your __gc finalization metamethods do more complex things requiring
allocation of new resources, in my opinion the program has a design bug:
the resources needed to free objects should have been allocated with the
object long before it gets dereferenced and then garbage collected for
finalization. Finalization should not allocate memory, except for very
simple objects that have NO such __gc finalization routine (such as strings
or simple preallocated buffers/indexed arrays).

Such a program can be modified to use a message loop based on timers to
schedule worker coroutines for its actual work, and another helper
coroutine for performing explicit GC cycles on objects whose finalization
has been delayed and will be managed externally (not in the worker
coroutines themselves): this emulates pseudo-threads (like those in old
versions of Windows with non-preemptive kernels and cooperative programs using
message loops and some priority lists).
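(For illustration, a bare-bones sketch of such a message loop in plain Lua, with made-up names: worker coroutines are resumed round-robin, and a helper coroutine advances the collector and runs postponed finalization work between resumes.)

```
-- Illustrative cooperative loop: round-robin workers plus a helper coroutine
-- that advances the GC incrementally and runs postponed finalization work.
local workers, deferred = {}, {}

local function add_worker(fn)
  workers[#workers + 1] = coroutine.create(fn)
end

local gc_helper = coroutine.create(function()
  while true do
    collectgarbage("step")                 -- advance the collector a little
    for i = #deferred, 1, -1 do
      table.remove(deferred, i):cleanup()  -- run cleanup postponed by __gc
    end
    coroutine.yield()
  end
end)

local function main_loop()
  while #workers > 0 do
    for i = #workers, 1, -1 do
      local ok = coroutine.resume(workers[i])
      if not ok or coroutine.status(workers[i]) == "dead" then
        table.remove(workers, i)
      end
    end
    coroutine.resume(gc_helper)            -- GC gets its own time slice
  end
end

add_worker(function()
  for i = 1, 3 do print("worker step", i); coroutine.yield() end
end)
main_loop()
```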

True multithreading in Lua is not easy: we can create threads but there's
little way to synchronize them cleanly (except by using external I/O): we
don't have things like mutexes, and no critical sections with atomic
operations for creating such mutexes or controlling accesses to shared
variables and communication buffers. And there's no scheduler we can
control (all threads created in Lua are equal). Only coroutines in the same
thread are easily controllable. Threads were added mostly to support
applications on servers communicating with many independent clients (where
no client can control what another client does, each client having its
own resources that can be freed at once in a single operation where all
objects of the thread are instantly marked for finalization and finalization
then runs in a tight loop).
Post by Soni L.
Has anyone made a reentrant Lua GC, to allow garbage collection within
__gc metamethod?
Post by Soni L.
I could really use this, but it doesn't seem to be a thing.
That's not the first thing I think of when I hear "reentrant." Lua's
GC *is* reentrant by the definition I usually use, which is to say
that multiple threads can GC distinct Lua states concurrently.
Reentrancy by the notion you suggest sounds pretty tricky. I can
envision how to do it if I were to build a GC from scratch, but given
how much of a niche use case it is I'm not sure if it would be worth
the effort.
What's your use case? Why can't you do something like queue up an
action inside your __gc metamethod that you can process after the
sweep has ended?
/s/ Adam
Tim Hill
2018-11-27 06:31:36 UTC
Permalink
If your __gc finalization metamethods do more complex things requiring allocation of new resources, in my opinion the program has a design bug: the resources needed to free objects should have been allocated with the object long before it gets dereferenced and then garbage collected for finalization. Finalization should not allocate memory except for very simple objects that have NO such __gc finalization routine (such as strings or simple preallocated buffers/indexed arrays).
Exactly. It seems to me the OP is concerned about temporary allocations during the __gc metamethod not being cleaned up until the next GC cycle. Well, welcome to the world of GC!

—Tim
Soni L.
2018-11-27 09:29:57 UTC
Permalink
I am concerned about an attacker setting a __gc metamethod that loops
forever and can't be broken.
Post by Philippe Verdy
If your __gc finalization metamethods do more complex things requiring
allocation of new resources, in my opinion the program has a design bug:
these resources needed to free objects should have been allocated with the
object long before it gets dereferenced and then garbage collected for
finalization. Finalization should not allocate memory except for very
simple objects that have NO such __gc finalization routine (such as strings
or simple preallocated buffers/indexed arrays).
Exactly. It seems to me the OP is concerned about temporary allocations
during the __gc metamethod not being cleaned up until the next GC cycle.
Well, welcome to the world of GC!
—Tim
Philippe Verdy
2018-11-27 10:29:51 UTC
Permalink
There's a way to restrict that infinite loop so that the __gc metamorphic
will not systematically be called at each cycle but only after a growing
delay (exponential growth) has passed. This will rapidly stop the infinite
loop even if these objects never get finalized, and at least it will
avoid consuming a lot of CPU resources.
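(A rough sketch of that throttling idea in plain Lua; the bookkeeping fields and the re-registration trick are illustrative, not an existing Lua mechanism.)

```
-- Illustrative exponential backoff for a finalizer that cannot complete yet:
-- re-register the object for finalization, but only retry the real cleanup
-- after 1, 2, 4, 8, ... further GC cycles.
local mt
mt = {
  __gc = function(obj)
    obj.skip = obj.skip or 0            -- cycles left before the next attempt
    obj.backoff = obj.backoff or 1      -- next skip count, doubled on failure
    if obj.skip > 0 then
      obj.skip = obj.skip - 1
      setmetatable(obj, mt)             -- resurrect; try again in a later cycle
      return
    end
    local done = obj.try_cleanup and obj.try_cleanup(obj)
    if not done then
      obj.skip, obj.backoff = obj.backoff, obj.backoff * 2
      setmetatable(obj, mt)             -- keep deferring, ever less often
    end
  end
}

local o = setmetatable({ try_cleanup = function() return false end }, mt)
o = nil
collectgarbage("collect")               -- first attempt; later ones are throttled
```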
Post by Soni L.
I am concerned about an attacker setting a __gc metamethod that loops
forever and can't be broken.
Post by Philippe Verdy
If your __gc finalization metamethods do more complex things requiring
allocation of new resources, in my opinion the program has a design bug:
these resources needed to free objects should have been allocated with the
object long before it gets dereferenced and then garbage collected for
finalization. Finalization should not allocate memory except for very
simple objects that have NO such __gc finalization routine (such as strings
or simple preallocated buffers/indexed arrays).
Exactly. It seems to me the OP is concerned about temporary allocations
during the __gc metamethod not being cleaned up until the next GC cycle.
Well, welcome to the world of GC!
—Tim
Philippe Verdy
2018-11-27 10:31:08 UTC
Permalink
Metafunction, not metamorphic. Damned Android substitution...
Post by Philippe Verdy
There's a way to restrict that infinite loop so that the __gc metamorphic
will not be systematically be called at each cycle but only after a growing
delay (exponential growth) has passed. This will rapidly stop the infinite
loop even if these objects get never finalized, and at least this will
avoid consuming lot of CPU resources.
Post by Soni L.
I am concerned about an attacker setting a __gc metamethod that loops
forever and can't be broken.
Post by Philippe Verdy
If your __gc finalization metamethods do more complex things requiring
allocation of new resources, in my opinion the program has a design bug:
these resources needed to free objects should have been allocated with the
object long before it gets dereferenced and then garbage collected for
finalization. Finalization should not allocate memory except for very
simple objects that have NO such __gc finalization routine (such as strings
or simple preallocated buffers/indexed arrays).
Exactly. It seems to me the OP is concerned about temporary allocations
during the __gc metamethod not being cleaned up until the next GC cycle.
Well, welcome to the world of GC!
—Tim
Pierre Chapuis
2018-11-27 10:43:42 UTC
Permalink
Post by Soni L.
I am concerned about an attacker setting a __gc metamethod that loops
forever and can't be broken.
So this is more about debug hooks not running during `__gc` then?

This is a very real problem that has existed for a very long time [1].
I don't know of another solution than not allowing untrusted users to
set `__gc`.
All sandboxes I know about (including those implemented in C),
whether or not they do something very violent like spawning a thread and killing it
after some time when unresponsive, are somehow vulnerable to this.

[1] https://github.com/lua/lua/commit/6c79a0a80d517354dcc19a1ef64569fba9b19365#diff-8ea37271806c5efe3d7bbb83e67046d1R244
--
Pierre Chapuis
Soni L.
2018-11-27 10:49:32 UTC
Permalink
The other solution is a reentrant/recursive GC, as far as I know.
Post by Soni L.
I am concerned about an attacker setting a __gc metamethod that loops
forever and can't be broken.
So this is more about debug hooks not running during `__gc` then?
This is a very real problem that has existed for a very long time [1].
I don't know another solution than not allowing untrusted users to set
`__gc`.
All sandboxes I know about (including those implemented in C), whether or not
they do something very violent like spawning a thread and killing it
after some time when unresponsive, are somehow vulnerable to this.
[1]
https://github.com/lua/lua/commit/6c79a0a80d517354dcc19a1ef64569fba9b19365#diff-8ea37271806c5efe3d7bbb83e67046d1R244
--
Pierre Chapuis
Pierre Chapuis
2018-11-27 11:04:55 UTC
Permalink
Yes maybe, I meant with stock Lua.
Post by Soni L.
The other solution is a reentrant/recursive GC, as far as I know.
On Tue, Nov 27, 2018, 08:44 Pierre Chapuis
Post by Pierre Chapuis
Post by Soni L.
I am concerned about an attacker setting a __gc metamethod that
loops forever and can't be broken.
So this is more about debug hooks not running during `__gc` then?
This is a very real problem that has existed for a very long time [1].
I don't know of another solution than not allowing untrusted users to set `__gc`.
All sandboxes I know about (including those implemented in C),
whether or not they do something very violent like spawning a thread and killing it
after some time when unresponsive, are somehow vulnerable to this.
[1] https://github.com/lua/lua/commit/6c79a0a80d517354dcc19a1ef64569fba9b19365#diff-8ea37271806c5efe3d7bbb83e67046d1R244
--
Pierre Chapuis
Roberto Ierusalimschy
2018-11-27 11:33:30 UTC
Permalink
Post by Soni L.
The other solution is a reentrant/recursive GC, as far as I know.
I still fail to see your point. The GC in Lua already is
"reentrant/recursive": you can freely call a garbage collection inside
a __gc metamethod. (Although, as others pointed out, this is smelly.)

And I do not see how this feature would solve the problem of
non-terminating finalizers set by bad actors.

Moreover, finalizers set inside a sandbox can get to run outside the
sandbox, so finalizers set by bad actors seem to have many other
problems besides non termination. The best solution here is, as
Pierre pointed out, not to allow setting __gc inside sandboxes.
(In general, both 'getmetatable' and 'setmetatable' should not be
available in sandboxes.)

-- Roberto
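(A minimal sketch of that sandboxing approach; the whitelist below is illustrative and deliberately omits setmetatable, getmetatable, and the debug library so untrusted code cannot install a __gc finalizer.)

```
-- Illustrative sandbox (Lua 5.2+): untrusted chunks get a restricted _ENV,
-- so they cannot reach setmetatable/getmetatable/debug and cannot set __gc.
local sandbox_env = {
  print = print, pairs = pairs, ipairs = ipairs, next = next,
  select = select, tonumber = tonumber, tostring = tostring, type = type,
  string = string, table = table, math = math,
  -- deliberately absent: setmetatable, getmetatable, rawset, debug, load, ...
}
-- (a real sandbox should also copy string/table/math so they cannot be mutated)

local function run_sandboxed(code)
  local chunk, err = load(code, "=sandbox", "t", sandbox_env)
  if not chunk then return false, err end
  return pcall(chunk)
end

print(run_sandboxed([[ return type(setmetatable) ]]))  -- true  nil
print(run_sandboxed([[ return math.max(1, 2) ]]))      -- true  2
```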
Soni L.
2018-11-27 12:02:05 UTC
Permalink
Post by Roberto Ierusalimschy
Post by Soni L.
The other solution is a reentrant/recursive GC, as far as I know.
I still fail to see your point. The GC in Lua already is
"reentrant/recursive": you can freely call a garbage collection inside
a __gc metamethod. (Although, as others pointed out, this is smelly.)
It disables debug hooks.
Post by Roberto Ierusalimschy
And I do not see how this feature would solve the problem of
non-terminating finalizers set by bad actors.
It would no longer disable debug hooks.
Post by Roberto Ierusalimschy
Moreover, finalizers set inside a sandbox can get to run outside the
sandbox, so finalizers set by bad actors seem to have many other
problems besides non termination. The best solution here is, as
Pierre pointed out, not to allow setting __gc inside sandboxes.
(In general, both 'getmetatable' and 'setmetatable' should not be
available in sandboxes.)
How would they run outside the sandbox?
Post by Roberto Ierusalimschy
-- Roberto
Roberto Ierusalimschy
2018-11-27 14:01:05 UTC
Permalink
Post by Soni L.
Post by Roberto Ierusalimschy
I still fail to see your point. The GC in Lua already is
"reentrant/recursive": you can freely call a garbage collection inside
a __gc metamethod. (Although, as others pointed out, this is smelly.)
It disables debug hooks.
The disabling of debug hooks has nothing to do with being reentrant (at
least for what I understand as reentrant). It is mainly to avoid confusion
when debugging, but maybe it would be better to allow hooks during
finalizers. The implementation can work both ways, with no other
modification than the setting of 'allowhook'.

(Similarly, a finalizer stops the GC only to avoid confusion, such
as another finalizer being called during the finalizer. There is
nothing in the code that demands those restrictions.)

-- Roberto
Soni L.
2018-11-27 15:09:51 UTC
Permalink
Hmm...

I think it would be quite good to allow debugging of finalizers by default.
A call/return hook can quite easily detect them and partially disable
itself if desired.

I've had to debug finalizers before, and it sucked, although it's not as
annoying as the effects no-debug-gc has on sandboxing.
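(Sketch only: in stock Lua the hook is simply not called while a finalizer runs, so the check below never triggers today. It just illustrates how a call hook could partially disable itself for finalizers it knows about, assuming the host registers them; register_finalizer and known_finalizers are made-up names.)

```
-- Hypothetical: a call/return hook that skips instrumentation inside
-- finalizer functions it has been told about.
local known_finalizers = setmetatable({}, { __mode = "k" })

local function register_finalizer(f)
  known_finalizers[f] = true
  return f
end

debug.sethook(function(event)
  local info = debug.getinfo(2, "f")   -- the function being entered/left
  if info and known_finalizers[info.func] then
    return                             -- stay out of finalizers' way
  end
  -- ...normal tracing / step counting would go here...
end, "cr")

local obj = setmetatable({}, {
  __gc = register_finalizer(function() --[[ cleanup ]] end),
})
```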
Post by Roberto Ierusalimschy
Post by Soni L.
Post by Roberto Ierusalimschy
I still fail to see your point. The GC in Lua already is
"reentrant/recursive": you can freely call a garbage collection inside
a __gc metamethod. (Although, as others pointed out, this is smelly.)
It disables debug hooks.
The disabling of debug hooks has nothing to do with being reentrant (at
least for I understand as reentrant). This is mainly to avoid confusion
when debugging, but maybe it would be better to allow hooks during
finalizers. The implementation can work both ways, without other
modifications than the setting of "allowhook'.
(Similarly, a finalizer stops the GC only to avoid confusion, such
as another finalizer being called during the finalizer. There is
nothing in the code that demands those restrictions.)
-- Roberto
pocomane
2018-11-27 11:33:12 UTC
Permalink
I am concerned about an attacker setting a __gc metamethod that loops forever and can't be broken.
How can a recursive/reentrant GC solve this issue? The first example I
can think of for a "__gc metamethod that loops forever and can't be
broken" is:

```
local x = {}
setmetatable({}, {__gc = function()
  while true do
    x[#x+1] = {} -- Not necessary, but funny
  end
end})
```
Philippe Verdy
2018-11-28 00:12:34 UTC
Permalink
This example is "blocking" all threads, but does not use recursion or
reentrance.

The real problematic example is this one:

local mt
mt = {__gc = function(o) setmetatable(o, mt) end}
do
  local x = {}
  setmetatable(x, mt)
end

After the "end", "x" should be finalized but will never be finalized and
we'll enter an infinite loop of gc cycles where "x" will be always present
in the finalization list, so it will remain resident as if it was static.
Now imagine the effect of:

while true do
  local x = {}
  setmetatable(x, mt)
end

It will allocate an infinite number of new objects that will never be
finalized. This thread will rapidly exhaust all resources allowed by the OS
for the process, possibly allocating gigabytes or more that will never be
used and will constantly grow the finalization list. The CPU will be
running at 100%, with a lot of paging to disk, causing severe problems to the
host OS if the Lua host process is not given hard limits using the OS API
(if not, the OS will finally hang and probably crash: this then amounts to a
successful DoS attack). But if we can force a __gc to not finalize the
object immediately, by putting it in a "delay" list that will be marked as
not finalizable and will remain reachable, we can limit the frequency of
finalization of these objects.

We still need a way to restrict the resources allocated by each
thread/coroutine (and that's why I suggest a new "__alloc" metamethod for
all Lua objects, and especially for threads created from a coordinating
parent thread), so they cannot exhaust what is allowed and cannot
freeze/crash the other threads, their host process, or the whole host OS.
Post by Soni L.
Post by Soni L.
I am concerned about an attacker setting a __gc metamethod that loops
forever and can't be broken.
How can a recursive/reentrant GC solve this issue? The first example I
can think of for a "__gc metamethod that loops forever and can't be
```
local x = {}
setmetatable({}, {__gc = function()
  while true do
    x[#x+1] = {} -- Not necessary, but funny
  end
end})
```
Tim Hill
2018-11-28 01:37:02 UTC
Permalink
This example is "blocking" all threads, but does not use recursion or reentrance.
while true do
local x={}
setmetatable(x, mt)
end
It will allocate an infinite number of new objects that will never be finalized. This thread will rapidly exhaust all resources allowed by the OS for the process, possibly allocating gigabytes or more that will never be used and will constantly grow the finalization list. The CPU will be running at 100%, with a lot of paging to disk, causing severe problems to the host OS if the Lua host process is not given hard limits using the OS API (if not, the OS will finally hang and probably crash: this then amounts to a successful DoS attack). But if we can force a __gc to not finalize the object immediately, by putting it in a "delay" list that will be marked as not finalizable and will remain reachable, we can limit the frequency of finalization of these objects.
t = {}
while true do
  table.insert(t, "foo")
end

So does this code. So do any number of similar run-away blocks of code. What is so special about your example that it should be prevented, when simple examples like mine can do the same thing?

—Tim
Philippe Verdy
2018-11-28 08:03:06 UTC
Permalink
There's a fundamental difference even if the effect looks the same: the
loop is not supposed to leak memory, yet nothing is garbage collected, and
most CPU time will be spent by the garbage collector trying to finalize more
and more objects without success; not in the Lua program itself but within
the Lua virtual machine, which becomes completely out of control and
probably crashes even before it can call the panic function.

This would not occur if there were a way to set safe limits on the
allocations a "thread" (in fact just a coroutine) can make: only the single
thread that ran over its limit would be affected by error() or by nil objects
being returned to it; the other threads would run unaffected. No panic
event would occur. And a parent thread could detect these situations,
terminate only the offending thread, and take other preventive
measures (by being able to identify the origin of that thread).

This is something needed for Lua programs that run threads to implement a
web service, or simply to be able (like in JavaScript) to use Lua as a
hypervisor to virtualize a complete OS: there's a need to be able to track
and restrict resource usage in each thread (this is also needed to protect
against Meltdown-like time-based attacks targeting Lua's allocator or
garbage collector, whose execution time becomes extremely high and thus
easily exploitable to create very effective "spying" side channels).
Post by Philippe Verdy
Post by Philippe Verdy
This example is "blocking" all threads, but does not use recursion or
reentrance.
Post by Philippe Verdy
while true do
local x={}
setmetatable(x, mt)
end
It will allocate an infinite number of new objects that will never be
finalized. This thread will exhaust rapidly all resources allowed by the OS
for the process, possibly allocating gigabytes or more that will never be
used and will constantly grow the finalization list. The CPU will be
running at 100%, with lot of paging to disk, causing severe problems to the
host OS if the Lua host process is not given hard limits using the OS API
(if not, the OS will finally hang and probably crash: this gives then a
successful DOS attack). But if we can force a __gc to not finalize the
object immediately by putting it in a "delay" list that will be marked as
not finalizable and will remain reachable, we can limit the frequency for
finalization of these objects.
t = {}
while true do
table.insert(t, "foo")
end
So does this code. So do any number of similar run-away blocks of code.
What is so special about your example that it should be prevented, when
simple examples like mine can do the same thing?
—Tim
Francisco Olarte
2018-11-28 11:54:48 UTC
Permalink
There's a fundamental difference even if the effect looks the same: the loop is not supposed to leak memory, but nothing is garbage collected. and most CPU time will be used by the garbage collector trying to finalize more and more objects without success, not in the Lua program itself but within the Lua virtual machine itself becoming completely out of control and probably crashing even before it can use the panic function.
Au contraire. Although your bottom-quoting style hides the code in
question, it can be read as "let mt be a metatable which forbids
garbage collection, enter an infinite loop setting the metatable of a
local table to said gc-forbidding table". It's clearly supposed to leak
memory, the same as the table.insert example by Tim (although the
second one does not leak: it consumes memory, but the memory is clearly
reachable and not leaked).

...
This is something needed for Lua programs that run threads to implement a web service,
NO, it is not. Lots of people run thread oriented web services without this.

Francisco Olarte.
Philippe Verdy
2018-11-28 12:19:14 UTC
Permalink
This is something needed for Lua programs that run threads to implement a web service,
NO, it is not. Lots of people run thread oriented web services without this.
It is not reasonable at all to run a web service with threads
servicing each concurrent request if they can steal unlimited amounts of
resources needed by concurrent threads. You need a way to police them, or at
least ensure that each thread gets an equitable share of resources, i.e.
equal shares of time and memory, without exceeding a maximum allowed
quantum, so that each thread can be sure to also get a minimum share and
can run without always failing.

If the web service does not do that, it is extremely easy to block with a
DoS attack. All reasonable requests should then succeed, and requests
exceeding their quota should be the only ones to fail, as expected.

We always need limits, because absolutely no server has infinite amounts of
resources. So we always need a way to strictly enforce these quotas.

With basic Lua threads, we have absolutely no control: any thread
can deliberately use the allocator as much as it wants, when it should
fail early without causing damage to other competing threads, including
hidden internal threads needed for the maintenance and monitoring of the
web service itself, or for enforcing the policy and performing cleanup of
failing threads that have exhausted their allowed quota.
Philippe Verdy
2018-11-28 12:39:59 UTC
Permalink
So Lua alone is not sufficient: the pseudo-threads created by
coroutine.create have an unlimited time resource if they never yield (because
the coroutines are not preempted at all by any scheduler enforcing a time
quota), and they also have unlimited access to the same allocator (there's
no equivalent to the C API function lua_setallocf, and in fact no way to
create such an allocator in Lua; but we could still have a
__alloc metamethod in thread object types that would be used to measure the
memory resources requested by the thread in order to limit them, by just
returning a Boolean authorizing or not the use of the global allocator).

For time resources, there's no solution with Lua coroutines if there's no
preemptive mechanism that would force them to yield after a maximum
timeslice quantum... For that purpose, we need TRUE threads plus a
scheduler, not just cooperative coroutines.

So we can't do that in pure Lua with its existing standard library. We need
an extension to threads, or we need to use OS threads to instantiate as
many separate Lua engines, using a host application written in C and using
the reentrant Lua C library...
Post by Philippe Verdy
This is something needed for Lua programs that run threads to implement a web service,
NO, it is not. Lots of people run thread oriented web services without this.
It is not reasonable at all to run any web service with threads for
servicing each concurrent request that can steal unlimited amounts of
resources needed by concurrent threads. You need a way to police them or at
least ensure that each thread will get an equitable share of resource, i.e.
equal share of time and memory, without also exceeding a maximum allowed
quantum so that each thread can be sure to also have a minimum share and
perform without failing always.
If the web service does not do that, it is extremely easy to block with a
DOS attack. All reasonable requests should then succeed and other requests
exceeding their quota will be the only one to fail as expected.
We need limits always because absolutely no server has infinite amounts of
resources. So we always need a way to strictly enforce these quota.
With basic Lua threads, we have absolutely no control in any thread that
can deliberately use the allocator as much as they want, when they should
fail early without causing damages to other competing threads, including
hidden internal threads needed for the maintenance and monitoring of the
web service itself, or for enforcing the policy, and perform cleanup of
failing threads that have exhausted their allowed quota.
Gé Weijers
2018-11-28 19:29:53 UTC
Permalink
Post by Philippe Verdy
So Lua alone is not sufficient: the pseudo threads created by
coroutine.create have unlimited time resource if they never yield (because
the routines are not preempted at all by any scheduler enforcing the time
quota) and they also have unlimited access to the same allocator (there's
no equivalent to the Luac function lua_setallocator, and in fact no way to
create such allocator in Lua, but we could still have the way to have a
__alloc metamethod in thread object types that would be used to measure the
memory resources requested by the thread in order to limit them by just
returning a Boolean authorizing or not the use of the global allocator
The Lua team designed an extension language that's powerful and small, and
you want it to be something else: a multithreaded language with advanced
scheduling and fine control over memory allocation, and because it's not,
you complain it's lacking in that respect. That's like buying a small
roadster and then complaining you can't haul building materials with it.

The coroutines in Lua solve a very useful problem: you can code
event-driven programs without having to resort to the callback hell you see
in Node.js. It's not meant to be used for single programs that can keep a
48 core server busy, although you can run multiple Lua states and figure
out a way to make them communicate (Lanes etc.)


Gé
Philippe Verdy
2018-11-29 14:05:03 UTC
Permalink
Post by Gé Weijers
Post by Philippe Verdy
So Lua alone is not sufficient: the pseudo threads created by
coroutine.create have unlimited time resource if they never yield (because
the routines are not preempted at all by any scheduler enforcing the time
quota) and they also have unlimited access to the same allocator (there's
no equivalent to the Luac function lua_setallocator, and in fact no way to
create such allocator in Lua, but we could still have the way to have a
__alloc metamethod in thread object types that would be used to measure the
memory resources requested by the thread in order to limit them by just
returning a Boolean authorizing or not the use of the global allocator
The Lua team designed an extension language that's powerful and small, and
you want it to be something else, a multithreaded language with advanced
scheduling and fine control over memory allocation, and because it's not
you complain it's lacking in that respect. That's like buying a small
roadster car and then complaining you can't haul building materials with it.
That's not what I requested. A full scheduler is not needed in the
language, but the language should allow building one entirely written in
Lua.

What is needed is a way to control, in Lua programs (essentially in a parent
thread controlling all the other threads it creates and monitors), the
resource usage and to inform the VM which metered operations are allowed or
not, and the VM should provide such basic metrics: the timeslot being
requested, or the amount of memory requested or no longer in use. Nothing
more. Then we can design a parent thread to do whatever it wants to apply
its own policy to the child threads it creates and controls, so that none of
these child threads can abusively take all the resources needed by other
competing child threads (including the parent thread itself, which can be
protected from being blocked by its children).
Milind Gupta
2018-11-29 22:11:08 UTC
Permalink
Post by Philippe Verdy
What is needed is a way to control in Lua programs (essentially in a
parent thread controling all the other threads it creates and monitors) the
resource usage and inform the VM which metered operations are allowed or
not, and the VM should provide such basic metrics: the timeslot which is
requested or the amount of memory requested or no longer in use. Nothing
more. Now we can design a parent thread to do whatever it wants to apply
its own policy for the children threads it creates and controls so none of
these children threads can abusely take all the resources needed by other
competing children threads (including the parent thread itself which can be
protected from being blocked by its children).
There is a Lua library to do that: LuaStepper
(https://luarocks.org/modules/aryajur/luastepper)
Philippe Verdy
2018-11-29 22:53:58 UTC
Permalink
Maybe, but it depends on an external C module implementing the preemptive
mechanism (also, its "tasks" are limited to scripts written as strings, so
they cannot work with Lua functions and closures without first marshalling
all parameters and their environment into the script (which will be
unmarshalled by the task engine), and then marshalling the results of the task
to a string using some "print" statement that will be unmarshalled by parsing
the output with getTaskData()).

So I think it will be quite slow (and not easy to integrate when communicating
complex Lua structures made with tables with possible cyclic references).

I think it uses a separate instance of the Lua engine, so these are not
"threads" sharing data with other threads and coroutines (they may
eventually share the same allocator and heap, but how do you pass references
between objects in a Lua thread and objects in a task, or in another task? You
also need extra communication channels, I think).

What is needed is (I think) a native integration with threads in the same
engine instance, to benefit from closures and avoid the extra (un)marshalling
using possibly very long strings.

The typical usage shown is extremely complex to start with, and I don't
understand the concept of "number of steps" in the "runLoop". We probably
don't need that, or this dramatically reduces the possible usages to some
limited scriptable tasks that can run independently of any context and just
take some input, run, then return some output, without communicating anything
with other tasks or normal Lua threads.
Post by Milind Gupta
Post by Philippe Verdy
What is needed is a way to control in Lua programs (essentially in a
parent thread controling all the other threads it creates and monitors) the
resource usage and inform the VM which metered operations are allowed or
not, and the VM should provide such basic metrics: the timeslot which is
requested or the amount of memory requested or no longer in use. Nothing
more. Now we can design a parent thread to do whatever it wants to apply
its own policy for the children threads it creates and controls so none of
these children threads can abusely take all the resources needed by other
competing children threads (including the parent thread itself which can be
protected from being blocked by its children).
There is a Lua Library to do that: LuaStepper
<https://luarocks.org/modules/aryajur/luastepper> (
https://luarocks.org/modules/aryajur/luastepper*)*
Milind Gupta
2018-11-29 23:22:24 UTC
Permalink
The C module just uses the Lua API and does not do anything OS specific.
Sorry I do not quite understand the other remarks about having tasks
only as strings. Every Lua program is a string to start with.
The run-loop number of steps is just the number of opcodes you want to
execute before returning control to the parent program.
Communication between tasks is limited to passing tables and strings
and not fancy shared objects. Recursive tables however should not be a
problem. Communication right now will be managed by the parent task.



On Nov 29, 2018 2:54 PM, "Philippe Verdy" <***@wanadoo.fr> wrote:

May be, but it depends on an external C module implementing the preemptive
meachanism (also its "tasks" are limited to scripts written as strings, so
they cannot work with Lua functions and closures without first marshalling
all parameters and its environment into the script (which will be
unmarshalled by the Task engine), and then marshall the results of the task
to string using some "print" statement that will be unmarshalled by parsing
the output with getTaskData().

So I think it will be quite slow (and not easy to integrate to communicate
complex Lua structures made with tables with possible cyclic references)

I think it uses a separate instance of the Lua engine, so these are not
"threads" sharing data with other threads and coroutines (they may
envtually share the same allocator and heap, but how to pass references to
objects in a Lua thread with objects in a Task or in another task ? You
also need extra communication channels I think.

What is needed is (I think) a native integration with threads in the same
engine instance to benefit of closures and avoid the extra (un)marshalling
using possibly very long strings.

The typical usage shown is extremely complex to start with, and I don't
understand the concept of "number of steps" in the "runLoop". We probably
don't need that or this dramatically reduces the possible usages to some
limited scriptable tasks that can run independantly of any context and jsut
take some input, run, then return some ouput, without communiating anything
with other tasks or normal Lua threads.
Post by Milind Gupta
Post by Philippe Verdy
What is needed is a way to control in Lua programs (essentially in a
parent thread controling all the other threads it creates and monitors) the
resource usage and inform the VM which metered operations are allowed or
not, and the VM should provide such basic metrics: the timeslot which is
requested or the amount of memory requested or no longer in use. Nothing
more. Now we can design a parent thread to do whatever it wants to apply
its own policy for the children threads it creates and controls so none of
these children threads can abusely take all the resources needed by other
competing children threads (including the parent thread itself which can be
protected from being blocked by its children).
There is a Lua Library to do that: LuaStepper
<https://luarocks.org/modules/aryajur/luastepper> (
https://luarocks.org/modules/aryajur/luastepper*)*
Philippe Verdy
2018-11-30 13:12:18 UTC
Permalink
You don't understand: Lua programs are parsed only once, then executed or
natively compiled from the bytecode only; strings are no longer used except
for loading the program the first time, and never when instantiating
objects written in that language. There's no marshalling/unmarshalling
between Lua function calls; references work natively (via indexes
in tables stored in the heap or via closures on the stack). It works
directly at the binary level, and there's never any need to marshall/unmarshall
complex data structures or serialize them by building and parsing strings
all the time.

I'm not convinced this is the right approach. Even when using the Lua C
library, there's a better way to do that without this huge overhead (which
also requires a lot of heap allocations for these temporary scripts
containing everything: the complete source code of all functions the task
needs, or the code that will use the loader, and all the data they'll
work with and finally return).
We can do that with the Lua C API by using existing Lua "thread" objects, to
which we can attach metadata adding info about how the threads can operate
and be scheduled.
The Lua C library is needed only to allow creating new threads with their
own allocator and to set up the metadata attached to threads, and possibly some
"userdata" object to tune the allocator's behavior.

However, you need small changes in the Lua bytecode interpreter so that it can
test a flag indicating that there's a pending interrupt, at which point the
interpreter should stop interpreting continuously and insert a call to
yield() before continuing with the next instruction. Finally, your C library
would be used to create the OS-level timer that just asynchronously sets
the interrupt flag, without directly modifying the state of any running
Lua thread. You also need a way for threads to handle critical
sections using some basic test-and-set instruction (this could be implemented
as a unary boolean operator, similar in syntax to "not" except that it
could not occur before arbitrary expressions but only before a variable name,
like with the "local" statement; this operator could use a new reserved
keyword like "testset" or a new lexical token like "?!"; the variable would
be set to a boolean or nil, so it does not need a new type, but it would
compile to a new bytecode instruction which would be atomic and just needs one
memory operand or a register index in the Lua closure's window on the stack).

If we don't define this additional bytecode, there's no possibility of
synchronizing concurrent threads that can preempt execution while
working on shared objects, leaving them in an inconsistent state (this greatly
limits the possibility of real multithreading when working with shared data
structures, or when performing I/O, so I'm convinced we need some mutex
mechanism). Adding this additional bytecode is a trivial modification to
the VM.

However, we don't need a full scheduler with relative priorities between
threads. All that is needed is to allow threads to run really concurrently
without having to modify them to include many "yield()" calls
everywhere in their code, within all their loops, or before/after each
blocking I/O.

If Lua is really reentrant as stated in its documentation, then even its
blocking I/O library would be interruptible and reentrant, so that we could
force it to yield() in Lua instead of blocking inside the OS thread with
its own internal yield giving control to other processes and OS-native
threads (but maybe this requires modifying the standard I/O library a bit
to force it to yield() in Lua instead of blocking in the native C
standard I/O library, or in an OS API used by the native C standard I/O
library).
Post by Milind Gupta
The C module just uses the Lua API and does not do anything OS specific.
Sorry I do not quite understand the other remarks about having tasks
only as strings. Every Lua program is a string to start with.
Run loop number of steps is just the number of opcodes you want to
execute before returning control to the parent program.
Communication between tasks is limited to passing tables and strings
and not fancy shared objects. Recursive tables however should not be a
problem. Communication right now will be managed by the parent task.
May be, but it depends on an external C module implementing the preemptive
meachanism (also its "tasks" are limited to scripts written as strings, so
they cannot work with Lua functions and closures without first marshalling
all parameters and its environment into the script (which will be
unmarshalled by the Task engine), and then marshall the results of the task
to string using some "print" statement that will be unmarshalled by parsing
the output with getTaskData().
So I think it will be quite slow (and not easy to integrate to communicate
complex Lua structures made with tables with possible cyclic references)
I think it uses a separate instance of the Lua engine, so these are not
"threads" sharing data with other threads and coroutines (they may
envtually share the same allocator and heap, but how to pass references to
objects in a Lua thread with objects in a Task or in another task ? You
also need extra communication channels I think.
What is needed is (I think) a native integration with threads in the same
engine instance to benefit of closures and avoid the extra (un)marshalling
using possibly very long strings.
The typical usage shown is extremely complex to start with, and I don't
understand the concept of "number of steps" in the "runLoop". We probably
don't need that or this dramatically reduces the possible usages to some
limited scriptable tasks that can run independantly of any context and jsut
take some input, run, then return some ouput, without communiating anything
with other tasks or normal Lua threads.
Post by Milind Gupta
Post by Philippe Verdy
What is needed is a way to control in Lua programs (essentially in a
parent thread controling all the other threads it creates and monitors) the
resource usage and inform the VM which metered operations are allowed or
not, and the VM should provide such basic metrics: the timeslot which is
requested or the amount of memory requested or no longer in use. Nothing
more. Now we can design a parent thread to do whatever it wants to apply
its own policy for the children threads it creates and controls so none of
these children threads can abusely take all the resources needed by other
competing children threads (including the parent thread itself which can be
protected from being blocked by its children).
There is a Lua Library to do that: LuaStepper
<https://luarocks.org/modules/aryajur/luastepper> (
https://luarocks.org/modules/aryajur/luastepper*)*
Tim Hill
2018-12-01 18:58:51 UTC
Permalink
So I think it will be quite slow (and not easy to integrate to communicate complex Lua structures made with tables with possible cyclic references)
We handled that with our own high-performance table pack/unpack logic, including arbitrary graphs of tables, string interning, and support for closure passing as well. We developed a pretty efficient algorithm (in C) which ended up giving us all the performance we needed. In fact, we ended up using the packed format (which was a compact binary form) as a network wire format AND a persistence format, so every part of the system saved/exchanged data in one well-defined format.

—Tim
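(Not that format, of course, but a much simplified illustration of how an arbitrary graph of tables can be packed: every table gets an id, and repeated or cyclic references become back-references to that id.)

```
-- Toy cycle-aware "pack": flattens a graph of tables into a list of entries,
-- where nested tables are replaced by {ref = id} back-references.
local function pack(root)
  local ids, out, n = {}, {}, 0
  local function visit(t)
    if ids[t] then return ids[t] end            -- already packed: reuse its id
    n = n + 1
    ids[t] = n
    local entry = {}
    out[n] = entry
    for k, v in pairs(t) do
      if type(k) == "table" then k = { ref = visit(k) } end
      if type(v) == "table" then v = { ref = visit(v) } end
      entry[#entry + 1] = { key = k, value = v }
    end
    return ids[t]
  end
  visit(root)
  return out   -- out[1] describes the root; serialize/transmit this flat list
end

local a = { name = "node" }
a.self = a                                      -- a cycle
local packed = pack(a)                          -- no infinite recursion
```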

Tim Hill
2018-11-28 21:22:39 UTC
Permalink
So we can't do that in pure Lua with its existing standard library. We need an extension to threads, or we need to use OS threads to instantiate as many separate Lua engines, using a host application written in C and using the reentrant Lua C library...
Which is exactly what we did on our project .. each Lua VM was event driven from an event queue and each event call was monitored for time via debug hooks and memory usage via a custom allocator. We then built a dispatcher using an OS thread pool to drive the VMs. The fact that we could do all that without a single patch to the Lua source was a testament to its clean design.
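(A rough Lua-level sketch of the time-monitoring half of such a setup; the budget and error message are illustrative, and a real host would do this from C with lua_sethook plus a custom allocator for the memory side. As discussed earlier in the thread, this hook does not fire inside __gc finalizers or C calls.)

```
-- Illustrative per-call time budget: a count hook fires every N VM
-- instructions and raises an error once the wall-clock budget is spent.
local function call_with_time_budget(f, seconds, ...)
  local deadline = os.clock() + seconds
  debug.sethook(function()
    if os.clock() > deadline then
      error("time budget exceeded", 2)
    end
  end, "", 10000)                        -- check every 10000 VM instructions
  local results = { pcall(f, ...) }
  debug.sethook()                        -- remove the hook
  return table.unpack(results)
end

print(call_with_time_budget(function()
  while true do end                      -- runaway event handler
end, 0.05))                              -- false   <error: time budget exceeded>
```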

And imho that is the point of Lua .. it’s a clean efficient building block you can mold to your needs, not an imposing edifice that must be used in a particular manner.

—Tim
Pierre Chapuis
2018-11-29 09:08:31 UTC
Permalink
Post by Tim Hill
Which is exactly what we did on our project .. each Lua VM was event
driven from an event queue and each event call was monitored for time
via debug hooks and memory usage via a custom allocator. We then built a
dispatcher using an OS thread pool to drive the VMs. The fact that we
could do all that without a single patch to the Lua source was a
testament to its clean design.
Almost the same here. Sadly we cannot open source it.
It lets you spawn Lua "processes" that run in separate Lua
threads, can respond to events and communicate through
some kind of "IPC".
From the point of view of the implementer, the execution
interface is abstract, so you can run the processes on top of
an OS thread pool, in an event loop, etc.

> ps
PID  STATE    REF  MEM       MAX_MEM   TIME     NAME
1    RUNNING  2    515.8KiB  636.3KiB  166.7ms  init
> spawn diagnose
PID=9
> ps
PID  STATE    REF  MEM       MAX_MEM   TIME     NAME
1    RUNNING  2    530.6KiB  636.3KiB  209.8ms  init
9    IDLE     1    723.5KiB  748.7KiB  10.0us   diagnose
> kill 9
> ps
PID  STATE    REF  MEM       MAX_MEM   TIME     NAME
1    RUNNING  2    544.3KiB  636.3KiB  214.8ms  init

The entire library is a mere 3000 lines of C and 500 lines of Lua,
and it runs on Mac, Linux, Windows, Android and iOS.

I wonder how many companies have re-built basically this,
with all the same tricks to interact with C code, event loops...
--
Pierre Chapuis
Tim McCracken
2018-11-29 17:04:57 UTC
Permalink
I wonder how many companies have re-built basically this, with all the same tricks to interact with C code, event loops...
I suspect a lot! I am currently working on a similar system. I am following a "soft RTOS" model: A single OS process will handle multiple "jobs", include a "global common" for shared data, and an embedded real-time(ish) database. Each "job" will include a Lua interpreter, prioritized event queue, recurring event scheduler, and a few other features and additional/alternative standard libraries. Each "task" will be a Lua function. And lots of value-added functionality - both generalized as well as application specific, which may vary based on the specific product.
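(Purely illustrative of a prioritized event queue with recurring events as described above; none of these names come from that product.)

```
-- Toy prioritized event queue with optional recurring events.
local queue = {}

local function post(priority, when, fn, every)
  queue[#queue + 1] = { priority = priority, when = when, fn = fn, every = every }
end

local function run_due(now)
  table.sort(queue, function(a, b)
    if a.when ~= b.when then return a.when < b.when end
    return a.priority < b.priority     -- lower number = higher priority
  end)
  while queue[1] and queue[1].when <= now do
    local ev = table.remove(queue, 1)
    ev.fn()
    if ev.every then                   -- recurring event: reschedule it
      post(ev.priority, now + ev.every, ev.fn, ev.every)
    end
  end
end

post(1, 0, function() print("high-priority task") end)
post(2, 0, function() print("recurring housekeeping") end, 5)
run_due(os.time())                     -- runs both; housekeeping is rescheduled
```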

I use the Lua interpreter "out of the box" stock, although I have added some additional libraries. I have been very happy with Lua. I took the time to understand its limitations, which in reality are very few since it can be easily extended through the API. IMO, it is much easier to extend and embed than either TCL or Python. It is a great platform to add lots of value-added C/C++ code and give users the ability to programmatically control functionality rather than just having access to configuration variables and tables - which is the norm for all of the systems I commonly work with. But I don't expect (or want) Lua to offer features that are part of the OS, or a 3GL library. What I really want from Lua is a structured, simple scripting environment for end users to customize as required - and Lua is outstanding at that.

There are (of course) language features I would like, many of which get discussed on this list regularly. But, in my case, they can all be accomplished with a few more keystrokes. *** For me the number one concern is the stability of the Lua language. *** I am more than willing to adapt to, and take advantage of, changes in the C API with each new release. But in my ideal world, the Lua scripts that customers may develop need to work from one release to the next. So going forward, I really hope we ___rarely___ see changes that are not backward compatible.
Sadly we cannot open source it.
The products I am working on will not be open source, but I am planning to have a fully functional (but limited in size) system for hobbyist/evaluation use at zero cost.
pocomane
2018-11-28 07:27:54 UTC
Permalink
This example is "blocking" all threads, but does not use recursion or reentrance.
This is EXACTLY my point. In other words, I wanted to highlight that
the issue pointed out by the OP (at least as far as I can understand)
can arise regardless of recursion or reentrance.

By the way, Roberto already replied with a clearer post :)