Discussion:
Strange memory allocation behavior
Luke Horgan
2018-09-14 16:35:49 UTC
Hello Lua friends,

We're using Lua to facilitate Ethereum-style smart contracts in a
C/C++ based cryptocurrency toolkit. In case you aren't familiar with
smart contracts, the idea is pretty simple. It's just a little
program that all the peers in a P2P network run. Every peer needs to
agree on the program's output, so the scripts need to be completely
deterministic. Also, any peer can submit a smart contract, so the
scripts run in a sandboxed environment with constraints on instruction
count and memory usage. Before running a contract script, we disable
the garbage collector with 'collectgarbage("stop")'. Hypothetically,
this shouldn't be a problem, since we only want scripts to be able to
allocate a fixed amount of memory, total, over the entire course of
their execution. However, it seems to cause the memory profile of
even trivial scripts to blow up for no particularly apparent reason.

We are tracking allocations using a custom lua_Alloc function, as
documented in the reference manual (version 5.3). Of particular
confusion is that functions which are never called still appear to
consume large amounts of memory (hundreds of kilobytes). This problem
does seem to be severely exacerbated by the use of custom variables we
load from the sandbox environment, so I can only imagine that's part
of the problem, but the behavior is confusing regardless. Why should
an uncalled function consume so much memory, or any memory at all? Is
there any reason lua_Alloc might behave strangely with the garbage
collector disabled?

I can find no obvious fault with my own code, so I thought it sensible
to check that there isn't some well known Lua behavior I'm unaware of.
Thanks for your help.

Best,
~Luke
Roberto Ierusalimschy
2018-09-14 19:10:56 UTC
Post by Luke Horgan
We're using Lua to facilitate Ethereum-style smart contracts in a
C/C++ based cryptocurrency toolkit. In case you aren't familiar with
smart contracts, the idea is pretty simple. It's just a little
program that all the peers in a P2P network run. Every peer needs to
agree on the program's output, so the scripts need to be completely
deterministic. Also, any peer can submit a smart contract, so the
scripts run in a sandboxed environment with constraints on instruction
count and memory usage. Before running a contract script, we disable
the garbage collector with 'collectgarbage("stop")'. Hypothetically,
this shouldn't be a problem, since we only want scripts to be able to
allocate a fixed amount of memory, total, over the entire course of
their execution. However, it seems to cause the memory profile of
even trivial scripts to blow up for no particularly apparent reason.
We are tracking allocations using a custom lua_Alloc function, as
documented in the reference manual (version 5.3). Of particular
confusion is that functions which are never called still appear to
consume large amounts of memory (hundreds of kilobytes). This problem
does seem to be severely exacerbated by the use of custom variables we
load from the sandbox environment, so I can only imagine that's part
of the problem, but the behavior is confusing regardless. Why should
an uncalled function consume so much memory, or any memory at all? Is
there any reason lua_Alloc might behave strangely with the garbage
collector disabled?
I can find no obvious fault with my own code, so I thought it sensible
to check that there isn't some well known Lua behavior I'm unaware of.
Thanks for your help.
'collectgarbage("stop")' only stops the collector. It should not
change in any other way the behavior of the program or the calls
to lua_Alloc.

-- Roberto
云风
2018-09-15 00:44:30 UTC
Post by Luke Horgan
Hypothetically,
this shouldn't be a problem, since we only want scripts to be able to
allocate a fixed amount of memory, total, over the entire course of
their execution. However, it seems to cause the memory profile of
even trivial scripts to blow up for no particularly apparent reason.
Creating new objects (tables, strings, closures, coroutines, etc.) and inserting keys into tables may need more memory.

If you need a fixed amount of memory, I think you can define a lua_Alloc function that enforces the memory limit. You can have lua_Alloc return NULL when the limit is exceeded, and Lua will collect immediately.
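
A minimal sketch of that kind of allocator, assuming the Lua 5.3 C API (the names limited_alloc, alloc_state and MEM_BUDGET are made up for illustration, and the code is untested):

/* Budget-enforcing lua_Alloc: tracks the bytes currently allocated and
 * refuses any request that would push the total over MEM_BUDGET. */
#include <stdlib.h>
#include <lua.h>
#include <lualib.h>
#include <lauxlib.h>

#define MEM_BUDGET (256 * 1024)   /* example limit: 256 KB */

typedef struct { size_t used; } alloc_state;   /* bytes currently allocated */

static void *limited_alloc (void *ud, void *ptr, size_t osize, size_t nsize) {
  alloc_state *st = (alloc_state *)ud;
  size_t old = (ptr != NULL) ? osize : 0;   /* when ptr==NULL, osize is only a type tag */
  if (nsize == 0) {                         /* free request */
    free(ptr);
    st->used -= old;
    return NULL;
  }
  if (st->used - old + nsize > MEM_BUDGET)  /* never triggers on shrinks */
    return NULL;                            /* Lua runs an emergency full GC and retries */
  void *np = realloc(ptr, nsize);
  if (np != NULL)
    st->used = st->used - old + nsize;
  return np;
}

int main (void) {
  alloc_state st = {0};
  lua_State *L = lua_newstate(limited_alloc, &st);  /* installs the allocator for the whole state */
  if (L == NULL) return 1;
  luaL_openlibs(L);   /* a real sandbox would open far fewer libraries */
  /* ... load and run the contract script here ... */
  lua_close(L);
  return 0;
}

Because the allocator is installed with lua_newstate, every allocation Lua makes internally is counted against the budget, not just the script's own data.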
Tom Sutcliffe
2018-09-17 11:28:03 UTC
Post by Luke Horgan
Before running a contract script, we disable
the garbage collector with 'collectgarbage("stop")'. Hypothetically,
this shouldn't be a problem, since we only want scripts to be able to
allocate a fixed amount of memory, total, over the entire course of
their execution. However, it seems to cause the memory profile of
even trivial scripts to blow up for no particularly apparent reason.
We are tracking allocations using a custom lua_Alloc function, as
documented in the reference manual (version 5.3). Of particular
confusion is that functions which are never called still appear to
consume large amounts of memory (hundreds of kilobytes). This problem
does seem to be severely exacerbated by the use of custom variables we
load from the sandbox environment, so I can only imagine that's part
of the problem, but the behavior is confusing regardless. Why should
an uncalled function consume so much memory, or any memory at all? Is
there any reason lua_Alloc might behave strangely with the garbage
collector disabled?
I doubt it is anything specifically to do with lua_Alloc or uncalled functions; it might just be that you're seeing more of the internals of how Lua works. Every scrap of memory Lua uses internally goes through the alloc function, not just the memory used for the variables and data structures your script declares, and there can be quite a lot of short-lived objects created.

For example, are you parsing Lua source code while the collector is stopped? If so then any temporary memory needed by the parser's internal data structures will not be freed up either.

Also, parsing Lua source itself requires memory to store the bytecode implementation of the functions. So the more stuff you parse, the more memory it'll consume even if you never call it. As a _very_ rough rule of thumb, the bytecode tends to take up slightly more memory than the original source code (although I don't remember any concrete figures).

I'm not sure I understand the perceived value of running the contract logic with the garbage collector disabled; the language by its nature uses the allocator a _lot_ behind the scenes, and running for prolonged periods without the collector ain't gonna be pretty. The really nice feature of an incremental collector is how the penalty of collection is spread out so there aren't any unpleasant stop-the-world delays. The only time I've ever considered disabling a garbage collector was in languages which didn't have an incremental collector and I was doing something time-critical for a short period. I would never bother in a Lua script - generally the only time I ever call collectgarbage() is to force collection of native objects which need __gc-ing, or for debugging, or when I'm not going to use the lua_State for a while and want to minimise its memory footprint.

If you want to impose a maximum allocation limit on the scripts, I'd suggest you allow the GC to run as normal, but keep track of the currently-allocated total in your lua_Alloc function rather than the cumulative sum of all allocations. As 云风 mentioned, returning NULL from the allocator generally triggers a full garbage collection cycle anyway which gives the runtime a chance to GC stuff and get back under budget again, so there isn't a huge amount you need to do to make it work.

I've had some fun in the past running instruction-count- and memory-constrained Lua interpreters, so it's definitely something the language can handle, providing your native functions aren't too much of a CPU-usage multiplier, as the hooks can't really help with "while true do reallySlowNativeFunction() end". Just let the garbage collector do its thing :-)
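
For reference, the instruction-count side of that is usually done with a count hook. A rough sketch against the 5.3 C API (count_hook, HOOK_INTERVAL and the budget value are illustrative, not from any real implementation):

#include <lua.h>
#include <lauxlib.h>

#define HOOK_INTERVAL 1000            /* check the budget every 1000 VM instructions */

static long remaining = 1000000;      /* total instruction budget (example value) */

static void count_hook (lua_State *L, lua_Debug *ar) {
  (void)ar;                           /* unused */
  remaining -= HOOK_INTERVAL;
  if (remaining <= 0)
    luaL_error(L, "instruction limit exceeded");   /* raises an error, aborting the script */
}

/* install before running the contract:
 *   lua_sethook(L, count_hook, LUA_MASKCOUNT, HOOK_INTERVAL);
 */

As noted above, the hook only counts Lua VM instructions, so a single slow native call still counts as one instruction.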

Cheers,

Tom
Luke Horgan
2018-09-17 17:16:00 UTC
Hello all,

Thank you so much for the information you've provided! Tom, your
answer in particular has answered many questions I had.

I have not disabled the garbage collector for performance reasons.
The issue (or part of it) is that the contract scripts are supposed to
be strictly deterministic, and I wasn't sure if there was any
randomness or unpredictability in the way the garbage collector
chooses when to run, or what to collect. Every script has to have
exactly the same output on every peer's machine. You can imagine a
scenario where someone deliberately writes a script that creeps up on
the memory limit. If the garbage collector isn't totally consistent,
then maybe it swoops in and frees memory on one user's machine before
it has a chance to fail, while on another user's machine it doesn't.
If I keep a running tally of total allocated memory, will that number
be consistent on every run, on every machine? I suppose I could
experiment a bit to find out, although it would be nice to gain some
insight into how the collector actually works behind the scenes.

Best,
~Luke
--
Luke Horgan
Candidate for BS in Computer Science and Computer Engineering
College of Engineering | Northeastern University
luke-horgan.com | 908-577-7902
Tom Sutcliffe
2018-09-23 16:55:33 UTC
Post by Luke Horgan
Hello all,
Thank you so much for the information you've provided! Tom, your
answer in particular has answered many questions I had.
I have not disabled the garbage collector for performance reasons.
The issue (or part of it) is that the contract scripts are supposed to
be strictly deterministic, and I wasn't sure if there was any
randomness or unpredictability in the way the garbage collector
chooses when to run, or what to collect. Every script has to have
exactly the same output on every peer's machine.
If the GC step and mul parameters are set the same, then the GC will behave _mostly_ identically on any machine running the exact same Lua binary (in particular, 32- and 64-bit builds will differ even when still on x86 - different architectures will have different sizes for data structures, which will affect collection logic). You would have to control for anything discernibly different about the environment that scripts can detect - any API that could return a different-sized result on different machines (even down to eg localisation of date formats, if those APIs are exposed). And I'm assuming your subset of the language locks down most things that could have side-effects.

I was thinking you might have to change l_randomizePivot to the recommended non-random ~0 implementation, but looking again at the code, it seems to only be used when sorting tables, which I don't think is a problem since that doesn't allocate. You may wish to do that anyway to make sorting fully deterministic in the time domain as well as the space domain.
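
For what it's worth, pinning those parameters can be done once per state from the host; a small sketch (200/200 are just Lua 5.3's defaults, not a recommendation):

#include <lua.h>

/* Fix the collector tuning before running any contract, so every peer
 * runs with identical GC parameters. */
static void pin_gc_params (lua_State *L) {
  lua_gc(L, LUA_GCSETPAUSE, 200);     /* pause between cycles: 200% */
  lua_gc(L, LUA_GCSETSTEPMUL, 200);   /* step multiplier: 200% */
  lua_gc(L, LUA_GCRESTART, 0);        /* make sure the incremental collector is running */
}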
Post by Luke Horgan
You can imagine a
scenario where someone deliberately writes a script that creeps up on
the memory limit. If the garbage collector isn't totally consistent,
then maybe it swoops in and frees memory on one user's machine before
it has a chance to fail, while on another user's machine it doesn't.
Whether this matters depends on whether you have a strict requirement that timing be absolutely exact (which might be a tricky thing to control for full stop, when running on a preemptive multitasking OS). If the timing doesn't have to be precisely exact, then the situation you describe (there not being quite enough freed by the collector at the point of allocation) will result in Lua initiating a full collection cycle and retrying the allocation (providing your allocator returns NULL when over budget, rather than, say, halting the script immediately). Obviously this will affect timing, but it should be (mostly) invisible to the script. Meaning that, assuming there's enough space left in the budget, it shouldn't matter whether or not the GC is fully up to date with its housekeeping. I'm assuming also that the memory limit you're imposing is sufficiently small compared to the physical capabilities of the machine that fragmentation won't be an issue.
Post by Luke Horgan
If I keep a running tally of total allocated memory, will that number
be consistent on every run, on every machine? I suppose I could
experiment a bit to find out, although it would be nice to gain some
insight into how the collector actually works behind the scenes.
I think that's an interesting question in and of itself :-) In theory, assuming you've controlled for everything I've described (and whatever else I failed to think about), then I think the answer would be yes, for a program not relying on undefined behaviour. Tracking that number over time across multiple runs sounds like it would be a useful way to see how things are behaving. But as I mentioned, it may not in fact matter to control the environment this strictly if you're only concerned about the memory budget, given the script will only get an "out of memory" error if there isn't any memory available *even after* collection.

Thinking about it some more, there are two aspects to this:

1) Ensuring the Lua environment uses exactly the same amount of memory when running a given script, regardless of the state of the machine it's running on, so as to guarantee that the memory limit imposed on the script is enforced consistently on all machines. This is all the controlling for external factors stuff, making sure there aren't any APIs exposed to Lua (or called by it internally as a result of anything the scripts can do) which can return different-sized results. Then, there's:

2) Ensuring entirely deterministic program execution in all scenarios, including code which depends on things which Lua says are implementation-defined. Because the GC can only be fully deterministic if the entire program flow is also deterministic. This is a much harder problem, as all the SPECTRE-style side-channel vulnerabilities have shown recently when applied to CPU hidden state. I could write a program involving a custom __gc metamethod which behaves differently depending on exactly when the GC ran. Equally I could write something which produced different output depending on what pointers the underlying allocator returned, because various things are hashed using their pointer value, and what they hash as affects the order that a pairs()-style table iteration would return them in (or, trivially, looking at the value returned by `tostring(function() end)` and doing something different depending on the value). Either of those two things could allow a badly-behaved program to behave differently across different runs, but only by exploiting things which Lua says are implementation-defined and that you shouldn't rely on. Note in particular, disabling the GC entirely doesn't prevent exploitation of the second of those two things. The question then is do you want to try and control for those too, with eg custom allocator trickery to hide the differences of the underlying malloc implementation, or a custom pairs() implementation which sorts all the keys or something.

So if you're only concerned about (1), then leaving the GC running and taking precautions to control for everything affecting memory usage is probably quite tractable. If you want to go the whole hog and try to tackle (2), that might be considerably more work: obfuscating information leaks (like the pointer value of a function via tostring()), maybe banning custom __gc metamethods, redefining pairs, or defining a fully custom allocator (rather than just a wrapper that enforces the budget then calls through) - there are a lot of things you might want to consider.

Regards,

Tom
Gé Weijers
2018-09-25 22:31:48 UTC
Post by Tom Sutcliffe
If the GC step and mul parameters are set the same, then the GC will
behave _mostly_ identically on any machine running the exact same Lua
binary (in particular 32- and 64-bit builds will be different even when
still on x86 - different architectures will have different sizes for data
structures which will affect collection logic). You would have to control
for anything discernibly different about the environment that scripts can
detect - any API that could return a different-sized result on different
machines (even down to eg localisation of date formats, if those APIs are
exposed). And I'm assuming your subset of the language locks down most
things that could have side-effects
The interpreter uses a randomized 'seed' for the hash, to make it harder to
perform attacks using hash collisions. If you run this script multiple
times the results are printed in a different order.

local T = {}
for i = 1, 10 do
  T[("X%d"):format(i)] = 1
end
for k in pairs(T) do
  print(k)
end

Another problem: you can 'observe' the garbage collector at work by using
the __gc metamethod and/or weak tables, and a script's result can easily
depend on whether a key or value is present in a weak table or whether the
function referenced by __gc executes.

Out of the box Lua is quite non-deterministic 🤔
--
Gé
Luke Horgan
2018-09-26 18:50:19 UTC
Thank you, Tom and Gé, for your thoughtful responses. They have given me
lots to think about!

In practice, I am struggling to conjure a scenario that produces different
results on each run. Banning __gc metamethods, as Tom suggests, is no
trouble, and I've already gone ahead and redefined tostring to prevent it
from leaking memory address information. Without those two things, how
else can I actually get non-deterministic behavior? On all the machines
I've tried so far, for instance, the script Gé provided appears to produce
identical output on every run. I understand the Lua specification doesn't
guarantee that behavior, but it would be handy to find a simple example
that actually produces reliably non-deterministic results (or at least
results that change with each execution). If anyone has any ideas about how
to do this, or has done it before, I'd really appreciate your input.

Best,
~Luke
--
Luke Horgan
Candidate for BS in Computer Science and Computer Engineering
College of Engineering | Northeastern University
luke-horgan.com | 908-577-7902
Gé Weijers
2018-09-26 23:42:00 UTC
Post by Luke Horgan
On all the machines I've tried so far, for instance, the script Gé
provided appears to produce identical output on every run. I understand
the Lua specification doesn't guarantee that behavior, but it would be
handy to find a simple example that actually produces reliably
non-deterministic results (or at least results that change with each
execution). If anyone has any ideas about how to do this, or has done it
before, I'd really appreciate your input.
On my Macbook Pro I'm seeing this: (Lua 5.3.4)

$ lua d.lua
X7
X8
X9
X4
X3
X6
X5
X1
X2
X10
$ lua d.lua
X3
X4
X1
X2
X10
X9
X6
X5
X8
X7

Look at the routine 'makeseed' in lstate.c. When a new lua_State is created,
makeseed returns a 'random' value that's used in the hash algorithms to make
them less predictable.
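
If determinism matters more to you than hash-collision resistance, one option (an assumption on my part, not something the Lua authors recommend) is to build Lua from patched sources with makeseed returning a constant. A rough sketch of the replacement in lstate.c for the 5.3 sources:

/* Fixed-seed makeseed(): string-hash order becomes identical on every peer.
 * The constant is arbitrary; dropping the randomization does make
 * hash-collision attacks easier, which may matter for untrusted scripts. */
static unsigned int makeseed (lua_State *L) {
  (void)L;            /* the stock version mixes addresses and time into the seed */
  return 0x2545F491u; /* any agreed-upon constant works */
}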



--
Gé
