compiler and handled by the underlying VM).
"lexical" scope is nonsense; it is backed by an effective structure.
create these objects.
type (in C only, not in Lua itself where it is always inaccessible).
don't modify or add any table entry. Look at how the bytecodes are
if the values in them are the same type (e.g. number or string).
Closures = overcomplicating it.
Post by Philippe Verdy
Ask yourself what LUA_TCLOSURE objects are (yes, in the heap...)
and see how they are correlated with the static "prototypes"
generated by the compiler, which instruct how to bind the caller's
variables to the callee's upvalues. The actual values are on the
stack, but at a position unknown to the callee, so a mapping
LUA_TCLOSURE object is allocated on the heap.
Ask yourself why these closure objects exist: it is because
functions may be recursive (and not necessarily through tail
calls), so the variables in closures may refer not to the
immediate parent caller's frame but to some ancestor frame an
arbitrary number of levels up the call stack. To avoid every
access to a closure variable having to pass through a chain,
there is a need for a mapping which is created and initialized
before each call to rebind each variable for the next inner call
(according to the closure's prototype). Such an object is
allocated only if the function uses external variables that are
not local to the function itself, i.e. that are not directly
within its stack window.
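
The binding step described above can be sketched in C roughly as
follows. This is a toy model under stated assumptions, not Lua's
actual source: UpvalDesc, UpvalMap and bind_upvalues are hypothetical
names, and the stack is reduced to absolute slot indexes. It only
shows how a mapping built once per call avoids walking a chain of
ancestor frames on every variable access.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPVALUES 8

/* Hypothetical descriptor emitted by the compiler for each upvalue of a
   prototype: the variable is either a local of the immediate caller, or
   one of the caller's own upvalues (living in some ancestor frame). */
typedef struct {
    int from_caller_local; /* 1: index is a register of the caller's frame */
    int index;             /* register number, or caller upvalue number */
} UpvalDesc;

/* Hypothetical mapping: upvalue number -> absolute index in the stack. */
typedef struct {
    size_t count;
    size_t slot[MAX_UPVALUES];
} UpvalMap;

/* Build the callee's mapping before the call. caller_map may be NULL when
   no descriptor refers to a caller upvalue. */
static UpvalMap bind_upvalues(const UpvalDesc *desc, size_t count,
                              size_t caller_base, const UpvalMap *caller_map)
{
    UpvalMap map = { 0 };
    assert(count <= MAX_UPVALUES);
    map.count = count;
    for (size_t i = 0; i < count; i++) {
        if (desc[i].from_caller_local) {
            /* Variable sits in the caller's own stack window. */
            map.slot[i] = caller_base + (size_t)desc[i].index;
        } else {
            /* Variable was already resolved for the caller: copy the
               absolute index, so no chain has to be followed later. */
            assert(caller_map != NULL);
            map.slot[i] = caller_map->slot[desc[i].index];
        }
    }
    return map;
}
```

With this layout, an access to upvalue i in the callee is a single
indexed load through map.slot[i], however deep the defining frame is.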
This also explains why the bytecode needs separate opcodes
for accessing registers and upvalues: if upvalues could be
directly on the stack, it would be enough to reference them with
negative register indexes and then use the stack as a sliding
window, as we do in C/C++ calling conventions (except that stack
indexes in C/C++ are negative for parameters and positive for
local variables, some of the former being cached in actual
registers but still backed by "shadow" variables on the
stack, allocated at fixed positions in the stack frame by the
compiler, or just pushed before the actual parameters and popped
to restore these registers after the call).
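
To make the "negative register index" idea concrete, here is a small
C sketch of an operand resolver that treats non-negative numbers as
registers of the current frame and negative numbers as upvalues. It is
an illustration of the proposal only: Frame, operand and op_add are
hypothetical names, values are plain ints, and nothing here reflects
how the real Lua VM decodes its operands.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical execution context for one call. */
typedef struct {
    int *stack;           /* the thread's whole value stack */
    size_t base;          /* first register of the current frame */
    const size_t *upslot; /* upvalue number -> absolute stack index */
    size_t nup;           /* number of upvalues */
} Frame;

/* Resolve a signed operand: r >= 0 is register r of the current frame,
   r < 0 is upvalue number (-r - 1), found through the mapping. */
static int *operand(const Frame *f, int r)
{
    if (r >= 0)
        return &f->stack[f->base + (size_t)r];
    size_t u = (size_t)(-(r + 1));
    assert(u < f->nup);
    return &f->stack[f->upslot[u]];
}

/* A single ADD that works on registers and upvalues alike, so no
   separate "get upvalue" / "set upvalue" instructions are needed. */
static void op_add(const Frame *f, int dst, int lhs, int rhs)
{
    *operand(f, dst) = *operand(f, lhs) + *operand(f, rhs);
}
```

For example, op_add(&f, 1, 0, -1) would compute r1 = r0 + upvalue 0
in one instruction, at the cost of a sign test on every operand.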
Performance tests have shown that closures are not so
fast: they can create massive amounts of garbage-collected
objects (with internal type LUA_TCLOSURE). I find this behavior
very curious, and the current implementation, which allocates the
LUA_TCLOSURE objects on the heap, is not the best option: the
mapping could be allocated directly in the stack of local
variables/registers of the caller (and all the closure objects
used by the caller could be merged into a single one, i.e. sized
as the largest closure object needed for inner calls, merged as
in a union). The closure objects themselves do not hold any
variable value; they are just simple mappings from a small fixed
set of integers (between 1 and the number of upvalues of the
called function) to variable integers (absolute indexes in the
thread's stack where the actual variable is located).
The bytecode is not as optimized as it could be: the register
numbers are only positive, the upvalue numbers are also only
positive, and they could form a single set (positive integers for
local registers, negative integers for upvalues, meaning that
they are used to index the entries in the closure object to get
access to the actual variable located anywhere on the stack,
outside of the immediate parent frame). The generated bytecode is
also suboptimal because various operations can only work on
registers or constants (like ADD r1,r2,r3), so temporary
registers must be allocated by the compiler (let's remember that
the number of registers is limited). Likewise, Lua's default engine
treats all registers the same, when most operations could work with
a single r0 register (an "accumulator") that could be implicit for
most instructions; this would reduce the instruction sizes
(currently 32-bit or 64 bits), which are inefficient because they
take up too much of the CPU's L1 data cache.
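
For readers who want to see where the bits go, here is a C sketch of
packing a three-operand instruction such as ADD into one 32-bit word.
The field widths (6-bit opcode, 8-bit A, 9-bit B and C) are modeled on
my understanding of the Lua 5.3 layout and should be treated as an
assumption, not as a statement about every Lua version; the helper
names and the opcode number used in any example are made up.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Instr;

/* Assumed field widths and positions, lowest bits first: op, A, C, B. */
enum { SIZE_OP = 6, SIZE_A = 8, SIZE_C = 9, SIZE_B = 9 };
enum {
    POS_OP = 0,
    POS_A  = POS_OP + SIZE_OP,
    POS_C  = POS_A + SIZE_A,
    POS_B  = POS_C + SIZE_C
};

/* Bit mask with the low `bits` bits set (bits must be below 32). */
static uint32_t mask(int bits)
{
    return (UINT32_C(1) << bits) - 1;
}

/* Pack opcode and three operands into a single 32-bit instruction. */
static Instr encode_abc(unsigned op, unsigned a, unsigned b, unsigned c)
{
    assert(op <= mask(SIZE_OP) && a <= mask(SIZE_A));
    assert(b <= mask(SIZE_B) && c <= mask(SIZE_C));
    return ((Instr)op << POS_OP) | ((Instr)a << POS_A)
         | ((Instr)c << POS_C)  | ((Instr)b << POS_B);
}

/* Extract one field from an instruction. */
static unsigned field(Instr i, int pos, int bits)
{
    return (unsigned)((i >> pos) & mask(bits));
}
```

In this layout the three operand fields take 26 of the 32 bits, which
is the space an implicit accumulator operand would partly save.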
I'm convinced that the current approach in the existing Lua VM
engine and its internal instruction set can be largely improved
for better performance, without really changing the language
itself, by getting better data locality (smaller instruction
sizes, fewer wasted unused bit fields) and by eliminating the use
of the heap for closures (to dramatically reduce the stress on
the garbage collector).
On Sat, 1 Dec 2018 at 17:08, Philippe Verdy wrote:
Post by Philippe Verdy
If there's something to do to the Lua engine, it is to rework
the way upvalues are allocated: they should be on the call
stack as much as possible (e.g. for up to ~250 upvalues) and
not on the heap.
The way I understand it, upvalues are not on the heap. They are on the
main execution stack below the bottom of the current function's stack
frame.
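
One way to picture how a captured variable can be "on the stack" and
"on the heap" at different times is a cell that points into the stack
while the enclosing function is still active, and takes over its own
copy of the value when that function returns. The C sketch below
illustrates that general technique with hypothetical names (Cell,
cell_open, cell_close) and plain int values. As far as I understand,
the reference implementation does something along these lines, but
this is a simplified model and not its actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical captured-variable cell. */
typedef struct {
    int *v;        /* where the variable currently lives */
    int own_value; /* storage used once the stack slot is gone */
} Cell;

/* Create a cell that still refers to a live stack slot ("open"). */
static Cell *cell_open(int *stack_slot)
{
    Cell *c = malloc(sizeof *c);
    assert(c != NULL);
    c->v = stack_slot;
    c->own_value = 0;
    return c;
}

/* Called when the enclosing frame is about to be popped: copy the value
   out of the stack and redirect the cell to its own storage ("closed"). */
static void cell_close(Cell *c)
{
    c->own_value = *c->v;
    c->v = &c->own_value;
}
```

While the cell is open, reads and writes through c->v touch the stack
slot itself; after cell_close, the same pointer dereference reaches
the copy, so the accessing code does not need to know the difference.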