Discussion:
[BUG] table will be garbage collected multiple times
Gregor Burghard
2018-11-18 08:51:31 UTC
Hello everyone,

here is a bug for Lua >= 5.3.0:

When the finalizer method of a table resets the metatable of the same
table, it will not be deleted after finalization. That means the table
still exists and will be garbage collected again.

The following code demonstrates the bug.

https://pastebin.com/LUbnueut
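
The pastebin contents are not reproduced in this thread. A minimal sketch of the reported behavior, assuming Lua 5.3 or 5.4, might look like the following. The names `MT` and `calls` are hypothetical, and this is not the original program:

```lua
-- Hypothetical reconstruction of the report, not the pastebin code.
local calls = 0
local MT = {}

MT.__gc = function(self)
  calls = calls + 1
  setmetatable(self, MT)  -- marks the table for finalization again
end

setmetatable({}, MT)      -- nothing references this table afterwards

-- Each full collection finds the table unreachable and runs __gc again,
-- because the previous run re-marked it.
collectgarbage("collect")
collectgarbage("collect")
collectgarbage("collect")
print("finalizer calls:", calls)
```

On Lua 5.3 and later the count should grow by one per full collection for as long as the finalizer keeps calling setmetatable with a metatable that has `__gc`.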
Francisco Olarte
2018-11-18 11:07:11 UTC
Post by Gregor Burghard
When the finalizer method of a table resets the metatable of the same
table, it will not be deleted after finalization. That means the table
still exists and will be garbage collected again.
A more knowledgeable person may be able to give better details, but given
https://www.lua.org/manual/5.3/manual.html#2.5.1 says

<<You mark an object for finalization when you set its metatable and
the metatable has a field indexed by the string "__gc". >>

and a bit later..

<<Moreover, if the finalizer marks a finalizing object for
finalization again, its finalizer will be called again in the next
cycle where the object is unreachable>>

Seems like documented / expected behaviour to me ( a bit weird, but I
assume there is a reason, I can think of a couple of them ).

Francisco Olarte.
Philippe Verdy
2018-11-18 14:34:13 UTC
I agree, the example shown marks the same object for finalization in the
next cycle by setting its metatable again to the MT table, which has a
__gc key mapped to the function. When the finalizer is called (the object
is being finalized), it unconditionally marks the object to be finalized
again in a later cycle. So the finalizer will be called indefinitely, once
per cycle.


    function MT:__gc()
      self.cnt = self.cnt + 1
      if self.cnt == 1 then
        print("finalizing...")
      else
        print("and again...")
      end
      setmetatable(self, MT) -- here the error occurs
    end

This should be:

    function MT:__gc()
      self.cnt = self.cnt + 1
      if self.cnt == 1 then
        print("finalizing...")
        setmetatable(self, MT) -- mark for later finalization
      else
        print("and again...")
      end
    end

This way only the first finalization (self.cnt == 1) will display
"finalizing..." and mark the object to be finalized again later (by
setting its metatable again). The next cycle will cause the finalizer to be
called again, but then self.cnt will be 2 and you'll get the message "and
again...". Now the object is not marked to be finalized again, so it
will actually be collected (you should see the "and again..." message
only once after the single "finalizing..." message).
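
A self-contained way to check this, assuming Lua 5.3 or 5.4: the sketch below uses the conditional finalizer, but records the messages in a hypothetical `log` table instead of printing them, so the number of finalizer runs can be inspected after several forced collections.

```lua
-- Sketch: the finalizer re-marks the object only on its first run.
local log = {}
local MT = {}

function MT:__gc()
  self.cnt = self.cnt + 1
  if self.cnt == 1 then
    log[#log + 1] = "finalizing..."
    setmetatable(self, MT)  -- request one more finalization
  else
    log[#log + 1] = "and again..."
  end
end

setmetatable({ cnt = 0 }, MT)  -- unreachable right after creation

-- More collections than finalizer runs, to show that the calls stop.
for _ = 1, 5 do
  collectgarbage("collect")
end
print(table.concat(log, " | "))
```

The finalizer should run exactly twice here, however many collections follow.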

It's not very well documented, but when a finalizer gets called on an
object, just before calling it, the GC first clears the associated
metatable if the object being finalized is a table. In the finalizer for an
object whose type is 'table' or 'userdata', if you use getmetatable(self),
it's not clearly documented whether you'll get nil or the same metatable
whose "__gc" entry is now nil (the latter would be better, since it would
allow you to store the "cnt" variable inside the metatable itself, along
with the "__gc" entry, instead of in the object being finalized).
Muh Muhten
2018-11-18 23:43:15 UTC
You might also (ab)use this to trigger bookkeeping tasks (once per GC
cycle), if you have no better way to do that. (A fixed "every $n
invocations of a function" scheme might not work (it could fire _both_
too rarely and too often, at different times), and in certain restricted
situations (games etc.), this might be as good as it gets… but note that
this is slightly racy – _any_ allocation can trigger a GC cycle, so
protect your data structures / make sure you're not reading inconsistent
state when triggered in the middle of some change.)
A workaround for this can be found in classical signal-handling
techniques: the GC action can set a flag to be checked elsewhere to
determine whether or not the bookkeeping needs to be done. Of course,
whether GC cycles are the right thing to count is a different
matter...
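
The flag-based approach could be sketched as follows, assuming Lua 5.3 or 5.4. The names `gc_cycle_seen`, `sentinel_mt` and `poll_bookkeeping` are hypothetical. The finalizer only records that a cycle happened and re-marks the sentinel; the real work runs later, at a point the program chooses.

```lua
-- Flag set by the finalizer and consumed at a safe point.
local gc_cycle_seen = false

local sentinel_mt = {}
sentinel_mt.__gc = function(self)
  gc_cycle_seen = true             -- only record the event here
  setmetatable(self, sentinel_mt)  -- re-mark: run again next cycle
end
setmetatable({}, sentinel_mt)      -- unreachable sentinel object

-- Hypothetical safe point, e.g. called once per main-loop iteration.
-- Runs `task` only if a GC cycle finished since the last call.
local function poll_bookkeeping(task)
  if not gc_cycle_seen then
    return false
  end
  gc_cycle_seen = false
  task()
  return true
end
```

Because the bookkeeping itself never runs inside the finalizer, it cannot observe data structures that are halfway through an update.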
Andrew Gierth
2018-11-19 01:24:23 UTC
Philippe> It's not very well documented, but when a finalizer gets
Philippe> called on an object, just before calling it, the GC first
Philippe> clears the associated metatable if the object being finalized
Philippe> is a table: in the finalizer for an object whose type is
Philippe> 'table' or 'userdata', if you use getmetatable(self), it's
Philippe> not documented clearly if either you'll get nil, or you'll
Philippe> get the same metatable whose "__gc" entry is now nil,
Philippe> something that should be better, allowing you to store the
Philippe> "cnt" variable inside the metatable itself along with the
Philippe> "__gc" variable, instead of the object being finalized).

I really don't know where you're getting this stuff, but it's all wrong.

The metatable of an object isn't changed in any way by garbage
collection, unless the __gc method chooses to do that itself (which is
sometimes a good idea, especially for userdatas where allowing method
calls on a post-finalization object(*) may be unwise). Inside the __gc
function for an object (whether table or userdata), getmetatable(self)
has the same value it had just before the object was collected, and
getmetatable(self).__gc is the __gc method being executed. Nor is the
metatable changed after the __gc method completes.

(*) - references to post-finalization objects can be easily obtained via
the keys of ephemeron tables
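
This is easy to check directly. A minimal sketch, assuming Lua 5.3 or 5.4 (the names `seen_mt` and `seen_gc` are hypothetical): the finalizer records what getmetatable(self) returns while it is running.

```lua
-- Record what the finalizer observes about its own metatable.
local MT = {}
local seen_mt, seen_gc

MT.__gc = function(self)
  seen_mt = getmetatable(self)
  seen_gc = seen_mt and seen_mt.__gc
end

setmetatable({}, MT)       -- unreachable right away
collectgarbage("collect")  -- runs the finalizer

print(seen_mt == MT, seen_gc == MT.__gc)
```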
--
Andrew.
Philippe Verdy
2018-11-19 05:12:55 UTC
Post by Andrew Gierth
Inside the __gc
function for an object (whether table or userdata), getmetatable(self)
has the same value it had just before the object was collected, and
getmetatable(self).__gc is the __gc method being executed. Nor is the
metatable changed after the __gc method completes.
Nothing forbids the getmetatable(self) to return the metatable of the
object it had before it was collected: self is referencing the closure in
which the finalizer was created and is independent of the object itself (a
finalizer function may be created and set in a (meta)table long before the
object is created and the (meta)table is associated with the object by
using setmetatable(object, (meta)table)).

Self is not the same as the object (o) passed in the 1st parameter of the
finalizer, whose metatable may still have been cleared.

As well nothing forbids getmetatable(o) to return the effective metatable
of the object (o) before it was modified by the GC: the GC prepares an
environment for calling the finalizer, in which the getmetatable function
will be found that will still be able to return that value that the GC has
kept.

But in my opinion all this is unnecessarily complicated. It would be much
simpler if the finalizer indicated to the GC that the object must not be
swept, by just returning a non-nil value. If the finalizer does nothing,
or reaches the end of its code without using a return instruction, or if it
uses "return nil", the effect is the same: the first value returned will
be nil and the object must then be swept.
If the finalizer just does "return true", it clearly indicates that the
object must be kept; the GC does not have to create a specific environment,
does not have to inspect the object state on return, and does not
have to track the usage of setmetatable() by the finalizer.
The GC will take its decision to sweep or keep the object only by looking
at the first value returned by the finalizer. This is much clearer! And
the GC does not even have to change any internal state of the object before
calling the finalizer, so this is also more efficient.
Finalizers could also return other interesting statuses for the object to
keep, indicating not just that it must be kept but also, for example,
whether it should be kept in the active generation or placed in an older
generation (to be finalized later, but much less urgently). For example, if
it returns false the object is kept in the current generation; if it
returns true, it is kept in the older generation, for Lua implementations
that support generations in their GC. The non-nil return value is then a
hint given to the GC about what to do with the preserved object, because
that object will be finalized again and again, at every GC cycle, if the
finalizer constantly returns a non-nil value when it is called! The hint
can be used to reduce the frequency of calls to this finalizer, which
acts like a coroutine running at unpredictable times, but indefinitely,
without ever really terminating as long as it returns a non-nil value or
does any other action indicating to the GC that the dead object must be
kept.
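
Current Lua ignores whatever a finalizer returns, so the protocol proposed here does not exist. It can be approximated in user code, though. The sketch below assumes Lua 5.3 or 5.4; the helper `keepalive_metatable` is hypothetical. It wraps a finalizer and re-marks the object with setmetatable whenever the wrapped function returns a true value.

```lua
-- Hypothetical helper: builds a metatable whose __gc re-marks the
-- object whenever the wrapped finalizer `fin` returns a true value.
local function keepalive_metatable(fin)
  local mt = {}
  mt.__gc = function(obj)
    if fin(obj) then
      setmetatable(obj, mt)  -- emulates "return true means keep"
    end
  end
  return mt
end

local calls = 0
local mt = keepalive_metatable(function(obj)
  calls = calls + 1
  return calls < 3           -- ask to be kept on the first two runs
end)

setmetatable({}, mt)
for _ = 1, 6 do
  collectgarbage("collect")
end
print("finalizer calls:", calls)
```

This version only covers the "keep or sweep" decision; the generation hints mentioned above have no counterpart in the public API.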
Gé Weijers
2018-11-19 18:54:10 UTC
You'd think that Lua source code is unavailable if you read this thread.
The finalizer implementation is fairly simple to understand:

- if you call 'setmetatable' on an object, and the metatable has a __gc
field the object is moved from the 'allgc' list to the 'finobj' list. The
FINALIZEDBIT is set in the object's GC state.
- when the object becomes unreachable it is moved back to the 'allgc'
list, and the FINALIZEDBIT is reset. The finalizer is then called.
- when the object becomes unreachable again (i.e. when it's on the
'allgc' list and there are no references to it) it's freed.

If you call 'setmetatable' in the finalizer, and that metatable has a
'__gc' field the object is moved back to the 'finobj' list, so the process
starts again and the object is never freed.

Observations:

- an object is only finalized once, unless you explicitly call
'setmetatable' again to set the finalizer.
- it's not a bug, the original poster's program explicitly requests that
the object be finalized again using 'setmetatable'. The behavior is
consistent with the documentation.
- there are other things you can do in the __gc function, like storing a
reference to the object somewhere so it stays reachable. The finalizer has
run, but the object will have to be kept around anyway because it just
became reachable again. Having the finalizer 'decide' what to do is
therefore unsafe. Freeing an object whose finalizer has not run yet requires
two GC cycles, at the end of the first one the finalizer runs, at the end
of the second one the object is freed.
- It's not clear to me what the use case is for changing the metatable
in a finalizer to begin with. In the normal case you'd expect the object to
be unreachable after the finalizer returns, so it will be freed in the next
pass.
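
The resurrection case from the third observation can be sketched as follows, assuming Lua 5.3 or 5.4 (the names `saved` and `calls` are hypothetical). The finalizer stores a reference to the object, which keeps it alive; once that reference is dropped, the object is freed without the finalizer running a second time.

```lua
-- The finalizer makes its object reachable again.
local calls, saved = 0, nil
local MT = {}

MT.__gc = function(self)
  calls = calls + 1
  saved = self               -- the object is reachable again
end

setmetatable({ name = "resurrected" }, MT)
collectgarbage("collect")    -- finalizer runs, object survives via `saved`

local name_after_gc = saved and saved.name

saved = nil                  -- drop the last reference
collectgarbage("collect")    -- no setmetatable was called again, so the
collectgarbage("collect")    -- object is freed without another __gc call
print(name_after_gc, calls)
```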

"It's not a bug, it's a feature!"

--
Gé
Andrew Gierth
2018-11-19 20:19:41 UTC
Gé> - It's not clear to me what the use case is for changing the
Gé> metatable in a finalizer to begin with. In the normal case you'd
Gé> expect the object to be unreachable after the finalizer returns, so
Gé> it will be freed in the next pass.

When dealing with userdatas in particular, it can be useful to reset the
metatable in the finalizer to nil or to a dummy metatable, because there
is a specific case where the object is _NOT_ unreachable after the
finalizer returns: if it's a key in an ephemeron table. (The ephemeron
table entry is not removed until the object is collected on the next
pass; so it's important that either the __gc method for a userdata
leaves the object data in a state that won't crash in the event of
further access to the object, or that it removes or replaces the
metatable.)
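
A sketch of the ephemeron case, assuming Lua 5.3 or 5.4 and using a plain table as a stand-in for a userdata (the names `eph`, `make` and `inspect` are hypothetical). After the first collection the finalizer has run, but the object can still be reached as a key of the weak-keyed table; the entry only disappears on the following collection.

```lua
local eph = setmetatable({}, { __mode = "k" })  -- weak keys
local MT = {}
MT.__gc = function(self) self.closed = true end

-- Create an object that is referenced only as a key of `eph`.
local function make()
  local obj = setmetatable({ closed = false }, MT)
  eph[obj] = "payload"
end

-- Returns the number of keys in `eph` and whether any of them
-- has already been finalized.
local function inspect()
  local n, finalized = 0, false
  for k in pairs(eph) do
    n = n + 1
    if k.closed then finalized = true end
  end
  return n, finalized
end

make()
collectgarbage("collect")     -- finalizer runs, key is still present
local n1, finalized1 = inspect()
collectgarbage("collect")     -- object is freed, entry is removed
local n2 = inspect()
print(n1, finalized1, n2)
```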
--
Andrew.
Gé Weijers
2018-11-19 23:06:47 UTC
Post by Andrew Gierth
When dealing with userdatas in particular, it can be useful to reset the
metatable in the finalizer to nil or to a dummy metatable, because there
is a specific case where the object is _NOT_ unreachable after the
finalizer returns [...]
I just figured that out, so it does make sense to remove the metatable if
your finalizer leaves your object in an unsafe state otherwise. I typically
have an explicit method that releases the resources used by a userdata
object (object:close() or similar), so I don't particularly care about
accesses via an ephemeron table.
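
A sketch of that pattern, assuming Lua 5.3 or 5.4 and again using a table in place of a userdata. `Resource`, `open_resource` and the `released` counter are hypothetical names. close() is idempotent and doubles as the finalizer, so an object that was closed explicitly is not released a second time when it is collected.

```lua
local released = 0            -- counts actual releases, for illustration

local Resource = {}
Resource.__index = Resource

-- Idempotent: releasing twice has no further effect.
function Resource:close()
  if self.handle then
    self.handle = nil
    released = released + 1
  end
end

-- Must be set before any setmetatable call that uses this metatable.
Resource.__gc = Resource.close

local function open_resource(handle)
  return setmetatable({ handle = handle }, Resource)
end

local r = open_resource("fd-1")
r:close()                     -- explicit release
r:close()                     -- harmless second call
r = nil
collectgarbage("collect")     -- __gc runs close() again, nothing to do
print("released:", released)
```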

Thanks,
--
--
Gé
Philippe Verdy
2018-11-20 17:42:24 UTC
So in summary, it is effectively the fact that we call "setmetatable(o,mt)"
on the object (o) passed as a parameter to the finalizer that changes its
state (a bit in the object).
In my opinion, this is not the best way to handle it:
- this requires a specific behavior of setmetatable: it inspects the
metatable to see if there's a __gc function in it, then sets the
FINALIZEDBIT if so, then returns to the finalizer, which may still
change the metatable after this.
- the only safe behavior would be that the finalizer **returns** an
effective status. For now finalizers are functions that return nothing (or
nil); they could return a non-nil value (probably non-false as well, e.g.
true) to indicate that the FINALIZEDBIT must be
CLEARED on the object. A finalizer that does nothing (an empty function)
would not return anything, so the GC can still safely look at the metatable
at this time, see if it has a __gc function, and if not it will set the
FINALIZEDBIT (allowing the object to be swept and freed).
Philippe Verdy
2018-11-20 17:55:35 UTC
This proposed behavior would have no impact on finalizers:
- they can still use getmetatable(o) and see the unmodified metatable of the
object (no need then for the GC to temporarily clear it or set it to a
dummy (different) metatable).
- they can use setmetatable(o,mt) as they want: no quirk needed in the
implementation of setmetatable (which then does not need to care about the
fact that a finalization of the object is pending).
- all is determined ONLY when the finalizer returns.

A simple finalizer like:
setmetatable(o, {__gc = function() return true end})
or even just:
setmetatable(o, {__gc = true})
(if we also honor a boolean value of __gc as equivalent to a function
returning a boolean) would be enough to say that the object "o" must NEVER
be finalized. (The alternative using a boolean would avoid the need to
perform any costly calls in repeated attempts to finalize the object: such
an object would not even be in a finalization list; the GC would then
automatically consider the object as "marked" and reachable, and the object
would never be freed, i.e. it would remain permanently in memory.)
Philippe Verdy
2018-11-20 18:17:33 UTC
As well:
setmetatable(o, {__gc = false})
would mean that the object can be finalized immediately: it would make the
object explicitly weak.
It would still need to be marked: if it's still reachable, notably in the
context where the previous statement is used, where it is still reachable
via (o), it cannot be finalized before (o) goes out of scope and no other
references to (o) remain. So if (o) is marked by the mark phase, it cannot
be put into the finalization list.

So:
setmetatable(o, {__gc =false})
would be mostly equivalent to:
setmetatable(o, {__gc =nil})
or:
setmetatable(o, {})
or:
setmetatable(o, nil)

Maybe we can imagine another use for __gc=false (notably with a
generation-based GC: meaning don't finalize in the current generation, but
keep the object in the older generation, in which case __gc will be reset
automatically to nil, and then the GC running on the older generation will
allow finalizing it at this time: when the object will be old)

We could also tweak the value given to __gc (or the value returned by the
function) to mean we want the object to be part of specific generations
identifiable as any object;

This would be useful for example to create different pools, notably for
caches (Lua applications would be able to efficiently manage their "cache
eviction policy", which is something very important to avoid DOS attacks
that attempt to clear caches used by concurrent threads, and to avoid time
attacks similar to Meltdown, measuring the time to honor requests, which is
shorter if an object is still in cache than when it is not because the
object has to be reconstructed, meaning that a third party can know if an
object was recently used by another thread).

To avoid Meltdown-like attacks, we must be able to restrict the cache
eviction by "segregating pools in caches": different pools are allocated
for different security contexts or different threads (Meltdown is not
affecting just CPUs, it concerns all computing systems that manage caches,
notably those using the very common LRU eviction policy). The GC in Lua
can easily become an easy target of Meltdown and DOS attacks if the
Lua-written software is used to service many users on the internet.
Tim Hill
2018-11-20 21:38:44 UTC
generations identifiable as any object;
This would be useful for example to create different pools, notably for caches (Lua applications would be able to manage efficiently their "cache eviction policy", which is something very important to avoid DOS attacks that attempt to clear caches used by concurrent threads, and to avoid time attacks similar to Meltdown, measuring the time to honor requests, which is shorter if an object is still in cache than when it is not because the object has to be reconstructed, meaning that a third party can know if an object was recently used by another thread).
In all this long thread, what specific problem are you trying to solve? It’s been established that the current GC behavior is by design and not a bug. And the various proposals around changes to __gc don’t seem to offer any new functionality.

—Tim
Sean Conner
2018-11-21 00:30:07 UTC
Post by Tim Hill
generations identifiable as any object;
This would be useful for example to create different pools, notably for
caches (Lua applications would be able to manage efficiently their "cache
eviction policy", which is something very important to avoid DOS attacks
that attempt to clear caches used by concurrent threads, and to avoid
time attacks similar to Meltdown, measuring the time to honor requests,
which is shorter if an object is still in cache than when it is not
because the object has to be reconstructed, meaning that a third party
can know if an object was recently used by another thread).
In all this long thread, what specific problem are you trying to solve?
It’s been established that the current GC behavior is by design and not a
bug. And the various proposals around changes to __gc don’t seem to offer
any new functionality.
As Dirk mentioned, Philippe may be a disciple of Bourbaki and is attempting
to bring more formalized rigor and abstraction to Lua.

-spc (Or maybe not ... I do not wish to contradict the Great Philippe
Verdy with false inferences)
Philippe Verdy
2018-11-21 20:17:53 UTC
Post by Sean Conner
As Dirk mentioned, Philippe may be a disciple of Bourbaki and is attempting
to bring more formalized rigor and abstraction to Lua.
-spc (Or maybe not ... I do not wish to contradict the Great Philippe
Verdy with false inferences)
You made a false inference, because I don't know and don't have any contact
I can remember of with this "Bourbaki".

Anyway formalism in programming languages is extremely useful, it allows
finding design bugs and ambiguities, it allows to make the programming
language better, more predictable, more portable, and more secure.

Even if this was not an initial goal, the success of Lua will cause the
language to be scrupulously analyzed to find and solve its weaknesses or
inconsistencies. Lua is not finished at its 5.3 version (or the new alpha
version 5.4 currently tested...).

We can expect a major version 6.0 coming next (that will need to break some
compatibility with earlier implementations outside the limits that will be
documented and that are still not documented at all, or just implied
informally by some existing known implementations), and certainly other
versions to fix newly discovered inconsistencies or portability problems.

All programmers need precise definitions of the semantics and limits of
their favorite programming language, in order to know before trying to use
it whether it will solve their problem, or whether their own development
will fail and finally be abandoned, or will need to be significantly
rewritten almost from scratch, taking into account the unsuspected limits
with workarounds or some complex additional library/layer.
Sean Conner
2018-11-21 20:25:51 UTC
Post by Philippe Verdy
Post by Sean Conner
As Dirk mentioned, Philippe may be a disciple of Bourbaki and is attempting
to bring more formalized rigor and abstraction to Lua.
-spc (Or maybe not ... I do not wish to contradict the Great Philippe
Verdy with false inferences)
You made a false inference, because I don't know and don't have any contact
I can remember of with this "Bourbaki".
That wasn't me, that was Dirk who made the inference, or did you miss the
"As Dirk mentioned" part?

-spc
Gé Weijers
2018-11-21 20:26:08 UTC
Post by Philippe Verdy
You made a false inference, because I don't know and don't have any
contact I can remember of with this "Bourbaki".
Know your French mathematicians (even if they're 'virtual'):

https://fr.wikipedia.org/wiki/Nicolas_Bourbaki

http://www.bourbaki.ens.fr/
--
--
Gé
Philippe Verdy
2018-11-21 20:42:57 UTC
I know this one, but there was nothing in your last messages that would
have told me that you referred specifically to him. I know a lot of homonyms
named "Bourbaki" (including in France where this name is not uncommon).

I can give this example:
https://fr.wikipedia.org/wiki/Charles-Denis_Bourbaki
but there are others in French Wikipedia:
https://fr.wikipedia.org/wiki/Bourbaki
and many other people that don't have "their" article in French Wikipedia.

My feeling when reading your messages was that you could have received
messages on this list from some subscriber nicknamed "Bourbaki", and that's
why I said I don't know and don't have any contact with "Bourbaki" (with
the explicit quotes: I could not figure out who you were speaking about).

So now I can conclude you don't like this mathematician or his theses, or
the fact that he works on formalism (with very successful results which
have very practical applications in many well-known programming languages
and their implementations). I can just see that he has a lot of famous
supporters, so I can reasonably trust him for his work.
Luiz Henrique de Figueiredo
2018-11-21 21:20:27 UTC
I think this thread has run its course. Let's move on.
Philippe Verdy
2018-11-21 21:21:09 UTC
Note: mathematicians are very respected in France.

We even had a French Republic President (Henri Poincaré) that is known for
its mathematical results and fundamental research (also in physics), whose
results are now widely used in a lot of scientific domains (including today's
computing), for example infinitesimal calculus, Lorentz transforms, optics,
chaos theory, and early work on relativity (a basis for the work of
Einstein, Heisenberg and many other European researchers in fundamental
physics and mathematics...).

https://fr.wikipedia.org/wiki/Henri_Poincar%C3%A9

Today's group of mathematicians named "Bourbaki group" (meeting in ENS
Cachan, a university institute, with buildings in Paris and its southern
suburbs in the Île-de-France region) chose the name in his honor; they also
work with another more recent group of researchers, named "Séminaire
Poincaré", created in 2001 on the model of the "Séminaire Bourbaki" but
with more focused topics/activities (with some mutual coverage in some
domains, so there are researchers who are members of both groups, and they
regularly have joint meetings and contacts).

What would today's computing industry be without the formal work done
by mathematicians? It would be built only on unspecified assumptions: "if
this solves 80% of problems, let's use it and sell it, and ignore the
remaining 20%".

Today's computing industry is now very concerned by security problems that
exploit the forgotten 20%, left unspecified and not formalized:
these 20% rapidly turn into severe problems that quickly come to
affect almost 100% of computer usage, and can now impact almost everybody
living on earth, all economies, all public policy and general public
security, and pose a severe threat to peace and democracy.

Formalism cannot kill. It is a very good defense against abuses and it
protects all of us, even if not all of us can understand how it works or
what practical problems it solves constantly, and most often silently, so
that users don't have to care much about it: the systems they use will
autodetect the errors, or unverified assumptions, and alert them, blocking
potentially dangerous effects which could be abused by some "blackhats"...
and which are now effectively exploited by blackhats on a very wide scale
using "armies of bots", which are not costly to create but very hard and
very costly to stop. We can already see the growing scale of their damage,
and its severe impact: immediate economic losses, lack of reparation after
damage, or exploding insurance costs, and finally: inflation of costs on all
products, or reduction of quality/usability/durability, or growing
production of waste, or restrictions of access to these products and
emergence of unfair discrimination, and finally emergence of extreme
politics.
Pierre Chapuis
2018-11-22 09:28:40 UTC
Permalink
Post by Philippe Verdy
We even had a French Republic President (Henri Poincaré) that is known
for its mathematical results and fundamental research (also in
physics)
This is *very* off-topic but I feel like I have to answer this for
personal reasons... :)
Henri Poincaré (the mathematician) and Raymond Poincaré (the president)
were different people (they were cousins).
There is no direct relationship I know of between Henri Poincaré and the
Bourbaki group, at least he was not a member.
Anyway, this has all gone way too far from Lua and its GC!
--
Pierre Chapuis
Philippe Verdy
2018-11-22 11:11:55 UTC
Permalink
Post by Philippe Verdy
We even had a French Republic President (Henri Poincaré) that is known for
its mathematical results and fundamental research (also in physics)
This is *very* off-topic but I feel like I have to answer this for
personal reasons... :)
Henri Poincaré (the mathematician) and Raymond Poincaré (the president)
were different people (they were cousins).
Oops sorry.

There is no direct relationship I know of between Henri Poincaré and the
Bourbaki group, at least he was not a member.
That's not what I said. I said that the Bourbaki group and the Poincaré
group are closely related and refer to each other. Both are linked by
their interest in the formal works made by the two mathematicians and are
named in honor of them.

And who first spoke about "Bourbaki" (without making explicit who he was) on
this list and on this topic? That was not me.
Lorenzo Donati
2018-11-22 23:40:03 UTC
Permalink
Post by Gé Weijers
Post by Philippe Verdy
You made a false inference, because I don't know and don't have any
contact I can remember of with this "Bourbaki".
https://fr.wikipedia.org/wiki/Nicolas_Bourbaki
http://www.bourbaki.ens.fr/
Great! I was about to post almost the same message! :-)

Philippe Verdy
2018-11-19 03:50:56 UTC
Permalink
So if I understand correctly, the metatable or table is not changed; what is
changed is only the presence of the object in the "list" of objects to be
finalized, which is filled at the start of the mark phase with all known
objects, which are then removed from the list when they are reached from the
stack and marked as reachable.

At the end of the mark phase there remains a list of unreachable objects that
will need to be finalized; then the finalization step starts, which takes
each object from the list and removes it, then calls the finalizer if
there is one; but is there any action in the finalizer that determines that
the object will then be swept?

The ONLY action I see is the fact that it calls setmetatable(); you are
saying that this does NOT change the metatable, strange!

But it must also do something else that marks the object as not to be
swept; however, the call to setmetatable is not the end of the
finalizer, which has still not returned to the GC sweeper; the finalizer may
still change the state of the metatable *after* calling setmetatable(), so
it could still set or remove its "__gc" entry. And nothing else will
happen before the finalizer returns, so there is nothing that can
actually set the required bit/flag property in the object itself properly.
Let's suppose that the GC then inspects the metatable at the end to see if
there's a __gc entry mapped to a function: how can it determine that the
function called setmetatable() or changed the entry in its metatable, and
differentiate it from the action of a finalizer that did nothing at all?
There must be an action taken by the finalizer to effectively indicate to
the GC that the object must not be swept and must be marked for later
finalization.

The finalizer may also resurrect that object by linking it to another
"live" object (i.e. a reachable object that has already been marked)
without calling setmetatable at all, and it can still set or reset the
__gc entry of its existing metatable.

All we know is that an object has a "state", which is: active but not yet
marked (possible only at the start of or during the marking phase, impossible
during the sweep phase), active and marked, dead to finalize, finalized to
sweep, or resurrected (to be made active but not yet marked again at the end
of the sweep phase). This state is not enough to determine if what a
finalizer does (or does not do) will cause the object to be swept or to be
finalized again later.

The only reliable info is that, just before calling the finalizer, the GC
will clear the link of the object to its metatable: it is then up to the
finalizer to reattach the metatable by calling setmetatable with a suitable
__gc entry attached to a finalizer function (not necessarily the same
function as the current finalizer itself). If there's no such call to
setmetatable, or if the finalizer clears the __gc entry or sets it to a
non-function, and if the object has not been resurrected by the finalizer
by linking it to an object with a "marked" status or an object with a "dead
to finalize" status (processed later in the same sweep cycle), then the
object will be swept by the GC just after the finalizer has returned.

That's what is not clearly documented: what is the effective status of the
object which differentiates an object being finalized to indicate to the GC
that it must not be swept after calling the finalizer? There must be an
action taken by the finalizer itself, but by default if this action is not
taken by the finalizer, then the finalization will be immediately followed
by sweeping.

And the only such action I can see is the finalizer calling setmetatable() to
set or restore the metatable which was detached from the object by the GC just
before calling the finalizer, simply by clearing the internal pointer to
the object's metatable, so when the finalizer calls setmetatable() to
set it to a non-nil value, this will have the desired effect of indicating
to the GC that the object must not be finalized.

E.g.:
- a TCP network session socket that has been closed but is still kept for
about one minute in FIN_WAIT state, during which that socket may still be
resurrected, in order to reuse its allocated port number and allow a fast
restart with its existing reception/transmission windows and MTU: this can
be useful for security against DoS attacks, to keep a server from exhausting
its port numbers, but also for privacy reasons, to secure all sessions
- another usage is to allow closed files some delay before they
get flushed physically, or because the flush itself may be long and may
need to be tested and retried several times before giving up and logging
some severe errors to inform the user or the program itself that something
bad happened asynchronously, without forcing close() to block until
flushing is fully completed
- another usage may be to delay the power-down of a previously used device
(e.g. turning off a screen display after several minutes when there is no
longer any new message to display), because turning the device back on may
be very lengthy if it was turned off immediately after a close
- another usage may be to deallocate other OS or external resources (e.g.
returning local memory used by Lua to the OS, by forcing all "weak" objects
to be deallocated, including for example caches, or deleting caches stored
in the filesystem that have outlived a "grace delay" during which they could
still be reused)
- another usage would be to start a
reorganization/optimization/defragmentation of the storage, or to physically
clear storage entries that are no longer in use: this could be I/O intensive
on large volumes, so such clearing will be done after a grace period, when
it is more easily performed, with lower impact, by doing it
sequentially instead of in random order on disk

Basically, finalizers are there to delay operations that can be postponed
without blocking the program that no longer immediately needs an object. This
still allows a program to reconstruct the object (notably "weak" objects for
caches) much faster if the underlying structures were not cleared and their
finalization was delayed for a grace period.

What you quote explains just that there are lists of objects from which
candidates are extracted, but it still does not indicate clearly which
action a finalizer takes to effectively change the state of the object so
that the GC will not sweep it when the finalizer returns. The GC must
then have already modified the state of the object (to indicate that it
MUST be swept) just before calling the finalizer, and the finalizer takes an
optional decision to change that state again and indicate that now it MUST
NOT be swept by the GC: the finalizer itself cannot change the various lists
of objects maintained only by the GC itself, and it cannot change its
"generation" model if generations are used in Lua 5.4 to subdivide the
lists of objects into smaller subsets (where GC and finalization will be
faster on young objects than on objects in older generations that have
survived more than one cycle and are less likely to need sweeping soon).
Post by Philippe Verdy
It's not very well documented, but when a finalizer gets called on an
object, just before calling it, the GC first clears the associated
metatable if the object being finalized is a table: in the finalizer
for an object whose type is 'table' or 'userdata', if you use
getmetatable(self), it's not documented clearly if either you'll get
nil, or you'll get the same metatable whose "__gc" entry is now nill,
something that should be better, allowing you to store the "cnt"
variable inside the metatable itself along with the "__gc" variable,
instead of the object being finalized).
That's complete nonsense. Any modification of the metatable would be
unsafe as these are commonly used on several objects (though not in this
example), so the collection / finalization of the first such object
would break the finalization of all other objects with the same shared
metatable.
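
A minimal sketch to check this point, assuming Lua 5.3 or later and full collections forced with collectgarbage(); the names MT and seen are only illustrative. Two objects share one metatable, and each finalizer run records whether getmetatable(self) still returns that table with its __gc entry in place.

```lua
-- Two objects share one metatable; each finalizer run records whether the
-- metatable and its __gc entry are still intact when the finalizer is called.
local MT = {}
local seen = {}

MT.__gc = function(self)
  local mt = getmetatable(self)
  seen[#seen + 1] = (mt == MT) and (type(mt.__gc) == "function")
end

-- The results are discarded, so both tables are garbage right away.
setmetatable({}, MT)
setmetatable({}, MT)

collectgarbage()
collectgarbage()

for i, ok in ipairs(seen) do
  print(i, ok)
end
```

If the GC detached or edited the metatable before calling the finalizer, the recorded values would be false for at least one of the two objects.
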
From the manual, §2.5.1:
For an object (table or userdata) to be finalized when collected, you
must mark it for finalization. You mark an object for finalization
when you set its metatable and the metatable has a field indexed by
the string `"__gc"`. Note that if you set a metatable without a
`__gc` field and later create that field in the metatable, the object
will not be marked for finalization.
And §3:
When a marked object becomes garbage, it is not collected immediately
by the garbage collector. Instead, Lua puts it in a list. After the
collection, Lua goes through that list. For each object in the list,
it checks the object's __gc metamethod: If it is a function, Lua
calls it with the object as its single argument; if the metamethod is
not a function, Lua simply ignores it.
And further, §5:
Because the object being collected must still be used by the
finalizer, that object (and other objects accessible only through
it) must be resurrected by Lua. Usually, this resurrection is
transient, and the object memory is freed in the next
garbage-collection cycle. However, if the finalizer stores the object
in some global place (e.g., a global variable), then the resurrection
is permanent. Moreover, if the finalizer marks a finalizing object
for finalization again, its finalizer will be called again in the
next cycle where the object is unreachable. In any case, the object
memory is freed only in a GC cycle where the object is unreachable
and not marked for finalization.
(I wouldn't call that "not very well documented"...)
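
The "permanent resurrection" case from the quoted paragraph can be sketched as follows, assuming Lua 5.3 or later. The names kept and finalized are hypothetical; the local variable kept stands in for the "global place" the manual mentions.

```lua
local kept            -- a reachable place that outlives the collection
local finalized = 0

setmetatable({ name = "example" }, { __gc = function(self)
  finalized = finalized + 1
  kept = self         -- store the object: it is reachable again from now on
end })

collectgarbage()
collectgarbage()

print(finalized, kept and kept.name)
```

The finalizer does not call setmetatable() again here, so the object is no longer marked for finalization and simply stays alive for as long as kept refers to it.
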
If, when you setmetatable(), there's _anything_ non-nil at `__gc` in the
metatable, the thing gets flagged for finalization. (This is a property
of the table/userdata, not the metatable.)
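
A small sketch of that flagging rule, assuming Lua 5.3 or later; the names log, late and placeholder are illustrative only. The first object gets its metatable before any __gc field exists, the second gets a metatable whose __gc is a non-function placeholder that is replaced afterwards.

```lua
local log = {}

-- Case 1: __gc is added only after setmetatable(), so the object
-- was never flagged for finalization.
local late = {}
setmetatable({}, late)
late.__gc = function() log[#log + 1] = "late" end

-- Case 2: something non-nil sits at __gc when setmetatable() is called,
-- so the object is flagged; the real function is installed afterwards.
local placeholder = { __gc = true }
setmetatable({}, placeholder)
placeholder.__gc = function() log[#log + 1] = "placeholder" end

collectgarbage()
collectgarbage()

print(table.concat(log, ","))
```

Only the second finalizer should show up in the log, because the flag lives on the object and is decided at setmetatable() time, while the function to call is looked up when the object is collected.
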
When the thing is later collected and it has the "to be finalized" bit
set, this bit is cleared and, if _at this point_ the value at `__gc` in
the metatable is a function, that function gets run.
(And no matter what it'll do, the object survives until the next
collection. Now _usually_, the "to be finalized" bit isn't re-enabled
by the `__gc` method and so the thing will be collected normally by the
next cycle... but you can re-flag it (by again calling setmetatable()
using a metatable with a `__gc` field), and even keep it around
indefinitely in an "undead" state – it's "dead" / fully unreachable from
the rest of the Lua state (hooks don't run during `__gc`), but it can
still do arbitrary stuff with the state.)
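
The re-flagging described here is what the original report ran into. A minimal sketch, assuming Lua 5.3 or later (the counter runs and the limit of 3 are arbitrary choices for the illustration):

```lua
local runs = 0
local MT = {}

MT.__gc = function(self)
  runs = runs + 1
  if runs < 3 then
    -- Re-flag the object: its finalizer will run again in the next
    -- cycle in which the object is found unreachable.
    setmetatable(self, MT)
  end
  -- On the third run the object is not re-flagged, so a later
  -- cycle frees it like any other garbage.
end

setmetatable({}, MT)

for _ = 1, 5 do
  collectgarbage()   -- force one full cycle per iteration
end

print(runs)
```

Dropping the `runs < 3` guard gives the "undead" object that is finalized once per cycle until the state is closed.
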
A fun / silly use of that is to make the computer beep on every
collection:

    setmetatable( {}, { __gc = function(t)
      io.stderr:write("\7") ; setmetatable(t, getmetatable(t))
    end } )
(This is easy to pre-load via the `-e` / `-l` options, and might be
useful for debugging... in fact, the Lua tests do something similar, just
writing a '.' for every collection instead of making it beep.)
You might also (ab)use this to trigger bookkeeping tasks (once per GC
cycle), if you have no better way to do that. (A fixed "every $n
invocations of a function" scheme might not work (it could fire _both_
too rarely and too often, at different times), and in certain restricted
situations (games etc.), this might be as good as it gets... but note that
this is slightly racy – _any_ allocation can trigger a GC cycle, so
protect your data structures / make sure you're not reading inconsistent
state when triggered in the middle of some change.)
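
A sketch of such a per-cycle hook, assuming Lua 5.3 or later. The names on_gc_cycle, arm and cycles are hypothetical, and the bookkeeping here is just a counter; the callback is kept tiny precisely because it can fire in the middle of other work.

```lua
local cycles = 0

-- Hypothetical bookkeeping task; keep it small and free of side effects
-- on data structures that may be half-updated when a cycle happens.
local function on_gc_cycle()
  cycles = cycles + 1
end

-- Create an unreachable sentinel whose finalizer re-arms itself.
local function arm()
  setmetatable({}, { __gc = function(sentinel)
    on_gc_cycle()
    setmetatable(sentinel, getmetatable(sentinel))  -- re-flag for next cycle
  end })
end

arm()

collectgarbage()
collectgarbage()
collectgarbage()

print(cycles)
```

In a real program the cycles would come from normal allocation instead of explicit collectgarbage() calls, so the hook fires at whatever rate the collector happens to run.
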
And of course there's LOTS of other stuff that you can do...

-- nobody