Discussion:
proposal: extend table constructors
Philippe Verdy
2018-11-04 15:44:52 UTC
I propose to extend the syntax of table constructors like this:

{ -- starts a new local context (closure)
[: ref0 = 'k1'] = : ref1 = v1, -- ['k1'] = v1 and set external ref0 = 'k1'
[: ref2] = v2, -- [1] = v2 and set external ref2 = 1
[:local ref3] = v3, -- [2] = v3 and set local ref3 = 2
[:local ref4 = 'k2'] = :local ref5 = v4, -- ['k2'] = v4 and set local ref4 = 'k2', local ref5 = v4
[:local ref6 = 'k3'] = : ref7 = v5, -- ['k3'] = v5 and set local ref6 = 'k3', external ref7 = v5
k4 = ref0, -- ['k4'] = 'k1'
[ 'k5'] = ref2, -- ['k5'] = 1
ref1, -- [3] = v1
} -- close the local context

The idea is to let constructors express cyclic references by defining
"tracking" variables that keep the key (possibly the implicit numeric key)
and the value assigned to that key. These variables can be defined in local
scope (the scope is a closure, the local context of the table constructor
itself) or in external scope (any enclosing scope, possibly outside any
table constructor, or in another enclosing table constructor).
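
To illustrate the limitation this proposal targets, here is a minimal sketch in current Lua. The names `t` and `u` are illustrative only and do not come from the proposal. It assumes no global named `t` exists, so the reference inside the constructor resolves to nil.

```lua
-- In current Lua, a local is not in scope inside its own initializer,
-- so `t` inside the constructor refers to a global `t` (nil here,
-- assuming no such global has been defined).
local t = { self = t }
assert(t.self == nil)

-- The usual workaround: create the table first, then link it imperatively.
local u = {}
u.self = u
assert(u.self == u)
```

The second form is the "imperative" style that the proposed syntax is meant to replace with a single literal.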

The syntax for keys is not changed, it still uses the [] to enclose an
expression.

However, the expression allowed in a key (between []) or in a value (after
the = if a key is specified before it) is extended by allowing one of
these forms to set variables:
:local variable (only for key expressions), or
:local variable = expression, or
:variable (only for key expressions), or
:variable = expression, or
expression

* The first two forms define the variable in the current local scope (this
creates a new variable in the local context, as with function closures
or the local contexts of enumeration variables in "for", hiding any external
variable defined with the same name).
* The third and fourth forms can use any lvalue expression: if the external
context does not have this variable defined, it will create the variable in
the external context, otherwise it will replace its current value.

This syntax extension is simple to parse (no complex look-ahead or
backtracking): it just starts with a leading ":" before the local or
external variable, followed by the '=' sign (which may be omitted together
with the expression only in the key part, and is required in the value part)
and the expression (optional only in the key part, where omitting it implies
an implicit numeric key).

You may prefer to change the syntax distinguishing the assignment of local
and external variables (":local variable" versus ":variable"). For example:
- ":variable" for a local variable (must be a "name" or
"[expression]"), or
- "?variable" for setting an external variable (which can be any lvalue
expression)

In all cases, the locality is limited to the scope of the table
constructor, it is permitted to reassign the same variable in the same
local or external scope, like in standard Lua assignment instructions, or
in standard declarations of local variables in functional blocks.

No limit is set on the types of values that these variables can hold (only
the nil value is invalid as the key of a table entry, but not invalid as
the value of a table entry).

This makes it possible to serialize any table entirely (including
cyclic/recursive ones) in a pure data-only format. The data-only format is
independent of the context where it is used if all variables set inside the
main table constructor are defined locally in the main table constructor
(but external, non-local variables can still be used in a sub-table).

It is even possible to serialize tables containing code (i.e. function
blocks), with their defined lambda expressions, possibly even coroutines.
Only functions that are defined as external C functions cannot be
serialized with their code; they can only be referenced by their binding
name. C functions and coroutines are most probably excluded from pure
data-only table constructors, but lambda functions may be permitted if the
code inside them does not use any external references to variables outside
the local scope of the main serialized table and does not use external
C functions or some restricted functions of the Lua library. Some functions
of the Lua library may still be allowed, including pcall(), which can
locally handle errors.
Dirk Laurie
2018-11-04 19:27:47 UTC
{ -- starts a new local context (closure)
[: ref0 = 'k1'] = : ref1 = v1, -- ['k1'] = v1 and set external ref0 = 'k1'
[: ref2] = v2, -- [1] = v2 and set external ref2 = 1
[:local ref3] = v3, -- [2] = v3 and set local ref3 = 2
[:local ref4 = 'k2'] = :local ref5 = v4, -- ['k2'] = v4 and set local ref4 = 'k2', local ref5 = v4
[:local ref6 = 'k3'] = : ref7 = v5, -- ['k3'] = v5 and set local ref6 = 'k3', external ref7 = v5
k4 = ref0, -- ['k4'] = 'k1'
[ 'k5'] = ref2, -- ['k5'] = 1
ref1, -- [3] = v1
} -- close the local context
Why is this better than writing it out in plain Lua that everyone can
already understand?
Coda Highland
2018-11-04 21:06:55 UTC
Post by Philippe Verdy
{ -- starts a new local context (closure)
[: ref0 = 'k1'] = : ref1 = v1, -- ['k1'] = v1 and set external ref0 = 'k1'
[: ref2] = v2, -- [1] = v2 and set external ref2 = 1
[:local ref3] = v3, -- [2] = v3 and set local ref3 = 2
[:local ref4 = 'k2'] = :local ref5 = v4, -- ['k2'] = v4 and set local ref4 = 'k2', local ref5 = v4
[:local ref6 = 'k3'] = : ref7 = v5, -- ['k3'] = v5 and set local ref6 = 'k3', external ref7 = v5
k4 = ref0, -- ['k4'] = 'k1'
[ 'k5'] = ref2, -- ['k5'] = 1
ref1, -- [3] = v1
} -- close the local context
Post by Dirk Laurie
Why is this better than writing it out in plain Lua that everyone can
already understand?
I understand the intent is that it can express internal references
(possibly circular) in a single literal instead of having to write the
plain object and then imperatively link in the references using subsequent
statements.

This could be implemented by a preprocessor that translates it into an IIFE
that does exactly that.

/s/ Adam
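
As a rough sketch of what such a preprocessor might emit, the following plain Lua wraps the construction in an immediately invoked function. The names `ref0` and `ref1` are borrowed from the proposal; `v1`, `v2`, `self` and the `me` field are placeholders chosen for illustration, and the mapping from the proposed syntax is an assumption, not a defined translation.

```lua
-- Placeholder values standing in for v1 and v2 in the proposal (hypothetical).
local v1, v2 = "first", "second"

-- Immediately invoked function: its locals play the role of the
-- constructor-scoped "tracking" variables and vanish once it returns.
local t = (function()
  local self = {}
  local ref0, ref1 = "k1", v1
  self[ref0] = ref1   -- ['k1'] = v1
  self[1] = v2        -- [1] = v2
  self.k4 = ref0      -- ['k4'] = 'k1'
  self.me = self      -- a cyclic reference, linked imperatively
  return self
end)()
```

Nothing named `ref0`, `ref1` or `self` is left in the enclosing scope afterwards, which matches the scoping behaviour the proposal asks for.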
Philippe Verdy
2018-11-04 22:06:57 UTC
The Lua spec uses an imperative form which is unnecessarily verbose (it
requires rewriting the table name and all indexing keys for tables and
subtables). This syntax in the Lua spec does not even allow using the
generated "text" with local semantics to create larger structures
containing or referencing the same data.
What I suggest is something that allows safe data-only structures
(including any graph) to be serialized completely. It just uses the powerful
concept of closures to apply scoping rules to the existing "{}" notation,
then adds a way to mark some keys or values in tables (because both can
create circular links) with a reference that can be used anywhere in the
data structure to reconstruct it exactly, and even embed the structure into
another one without scoping problems.
These "variables" are even more powerful than id="" attributes in HTML
(which are in a single namespace): you don't need any complex renaming
scheme. Everything is already in Lua: while parsing the "{}", not only is a
new table generated, but another table is used to contain the variables in
the same scope. This internal table of variables, created when you encounter
the leading "{", is discarded/garbage collected when you reach the closing
"}". It also keeps track (for example in its associated metatable) of the
scopes of new subtables created inside.
The implementation of this is easy to do in Lua itself, but it could as
well be part of its native syntax. It could also facilitate debugging (e.g.
print(table1) could dump all its content without just printing "table" and
without falling into an infinite loop of self-recursion, using the same
tracking technique used in the Lua spec, which unfortunately generates a
lengthy multiline output with many assignments and a lot of redundancy).
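
For reference, the imperative tracking technique mentioned above can be sketched as follows. This is an illustrative reconstruction, not code from any specification: the function names `literal` and `save` and the output name `a2` are made up for this example, and only number, boolean and string scalars are handled.

```lua
-- Render a scalar (number, boolean or string) as Lua source text.
local function literal(v)
  if type(v) == "string" then
    return string.format("%q", v)
  end
  return tostring(v)
end

-- Append assignment statements that rebuild `value` under `name`.
-- `seen` maps each table already visited to the name it was saved under,
-- so cycles and shared subtables become references instead of recursing.
local function save(name, value, seen, out)
  if type(value) ~= "table" then
    out[#out + 1] = name .. " = " .. literal(value)
  elseif seen[value] then
    out[#out + 1] = name .. " = " .. seen[value]
  else
    seen[value] = name
    out[#out + 1] = name .. " = {}"
    for k, v in pairs(value) do
      save(name .. "[" .. literal(k) .. "]", v, seen, out)
    end
  end
  return out
end

local a = { x = 1 }
a.self = a   -- a cycle
local text = table.concat(save("a2", a, {}, {}), "\n")
print(text)
```

The output is one assignment per field, each repeating the full path from `a2`, which is the verbosity and redundancy the proposal objects to. It also assigns to a fixed name, so the text cannot be dropped into a larger constructor as a value.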
Philippe Verdy
2018-11-04 22:52:31 UTC
Note also that the same extended syntax can be used to serialize the
metatable as well: just consider that the metatable is the value assigned to
the (reserved) nil key!
So:
{nil={}}
builds an empty table and sets its metatable to another empty table.
The metatable can also reference the main table itself as in:
{:local t={nil=t}},
which creates a table containing (at the implied index [1]) an empty table
(locally named "t") whose metatable (indicated by "nil=t") is this same
table "t".
Then you can drop the outer table using the key [1] where it was implicitly
stored as a value:
{:local t={nil=t}}[1]
returns just the table t itself with its metatable set to itself.

If you don't like the idea of using "nil" to reference the metatable,
another alternative is to use ":" just after the table closure to specify
the metatable:
So:
{}:{}
is the equivalent to the first solution, and
{:local t={}:t},
is the equivalent of the second solution, and
{:local t={}:t}[1]
is the equivalent to the third solution.
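
For comparison, the self-referencing metatable case can be written in current Lua as below. This is only a sketch of the plain-Lua counterpart; the `__index` handler is added purely to show that the metatable is active and is not part of the proposal.

```lua
-- Plain-Lua counterpart of the proposed {:local t={}:t}[1]:
-- a table whose metatable is the table itself.
local t = {}
setmetatable(t, t)
assert(getmetatable(t) == t)

-- Because the metatable is `t` itself, metamethods are stored directly on it.
t.__index = function(_, key) return "missing " .. tostring(key) end
assert(t.anything == "missing anything")
```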

Note that the (local) "t" variable is instantiated immediately (and set
immediately in the internal table of scopes used during the construction)
to reference the value under construction. If this value is a table, its
internal keys have not yet been enumerated and assigned a value, and its
metatable is still nil until we assign it with the ":" that immediately
follows the "}", which can directly use the value kept in that variable "t".

If you prefer not having to use the "local" keyword and want local scopes
to be the default, external scopes using another syntax, the second
solution can also be written as:
{:t={nil=t}},
or without using the "nil" pseudo key for the metatable:
{:t={}:t},
The third solution as well can be written as:
{:t={nil=t}}[1]
or
{:t={}:t}[1]

For assigning variables with external scopes, e.g. a table that contains an
inner table (which internally defines a table "b") and also the table "b"
itself, you could write something like:
x = {[:b], {[:*b]={}}, b}[1]
* the scope of "b" is the outer table in which it is defined locally (but
does not generate any key/value pair in the outer table in construction),
but then it is assigned the reference to the innermost table.
* what you get from this expression is a table x such that x[2] = {}, and
x[1] = {x[2]}.
* there's NOTHING in the table x (or its metatable which is nil here) that
still contains the name "b" used only during its construction.
Philippe Verdy
2018-11-04 23:15:22 UTC
Post by Philippe Verdy
For assigning variables with external scopes, e.g. an table contains an
inner table (which internally defines a table "b"), and the table "b",
x = {[:b], {[:*b]={}}, b}[1]
* the scope of "b" is the outer table in which it is defined locally (but
does not generate any key/value pair in the outer table in construction),
but then it is assigned the reference to the innermost table.
* what you get from this expression is a table x such that x[2] = {}, and
x[1] = {x[2]}.
* there's NOTHING in the table x (or its metatable which is nil here) that
still contains the name "b" used only during its construction.
Sorry, I made several typos in this example, sent too soon (a missing brace
and an incorrect use of [] for what is not a key but a value). I meant:
x = { :b, { {::b={3,4,5} }, b } }[1]
Note that the external scope is made explicit by the two colons "::". We can
imagine using three colons or more to count the enclosing scopes, or using
":*" to refer directly to the global scope (where the variable is set
outside the table constructor itself, as a normal variable), which makes
things simpler: there is no need to isolate the "b" in a temporary enclosing
table to restrict its scope and then drop this temporary table by indexing
it with "[1]".
b = 1;
x = { { :*b = {3,4,5} }, b }
This also changes the value of variable "b", replacing it with the
constructed value, which is accessible after the construction. So the value
of variable "b" is no longer 1 but the innermost table {3,4,5},
and we have x[1][1] == b and x[2] == b.
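
The end state described for the ":*" example can be reproduced in current Lua with an explicit assignment before the constructor. This is a sketch of the equivalent result only. It uses locals for self-containment, whereas the proposal's ":*" would target a global.

```lua
local b = 1
b = { 3, 4, 5 }           -- what ":*b = {3,4,5}" would assign
local x = { { b }, b }    -- both slots hold the same table, not copies

assert(x[1][1] == b and x[2] == b)
assert(#b == 3)
```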
Philippe Verdy
2018-11-04 23:32:29 UTC
I can also add that, because metatables are normal tables, they can also
have their own metatables, so this would make sense:
{}:{}:{}
to create an empty table whose metatable is a second empty table whose
metatable is a third empty table whose metatable is nil.

And instead of the code:
t = {1,2,3}; setmetatable(t, {4,5,6})
you can write directly and much more simply:
t = {1,2,3}:{4,5,6}
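
For comparison, current Lua already allows a single-expression form, because setmetatable returns its first argument. The sketch below shows that form and the chained case. The variable names are illustrative.

```lua
-- setmetatable returns the table it was given, so construction and
-- metatable assignment fit in one expression.
local t = setmetatable({ 1, 2, 3 }, { 4, 5, 6 })
assert(t[1] == 1)
assert(getmetatable(t)[1] == 4)

-- Chained metatables, the counterpart of the proposed {}:{}:{}
local chained = setmetatable({}, setmetatable({}, {}))
local mt1 = getmetatable(chained)
local mt2 = getmetatable(mt1)
assert(getmetatable(mt2) == nil)
```

What this form cannot express is a metatable that refers back to the table being built, which is where the proposed tracking variables would come in.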
Andrew Starks
2018-11-04 21:24:25 UTC
I think you can use ltokenp to implement this / try it out.

http://lua-users.org/lists/lua-l/2016-05/msg00028.html