Discussion:
auto classes and finalizers
Sean Kelly
2006-04-05 16:00:57 UTC
I've been following a thread on GC in c.l.c++.m and something Herb
posted about C++/CLI today got me thinking:

- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called

Given that D can have lexical destruction of objects that weren't
explicitly designed for it, ie.

class C {}
auto C c = new C();

Might it not be worthwhile to do something similar to the above? This
would allow objects to explicitly delete all their contained data in
instances where they are being used as auto objects, rather than always
relying on the GC for this purpose. I'll admit I don't particularly
like the idea of separate finalize() and ~this() methods, but it seems
an attractive enough feature that something along these lines may be
appropriate.
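
A minimal sketch of the lexical destruction described above, written in present-day D, where the storage class for this is spelled 'scope' rather than the 'auto' used in this thread. The counter and the function name useScoped are made up for illustration; the class itself is not declared specially in any way.

```d
import std.stdio;

class C
{
    static int destroyed;        // demo-only counter, not part of the idea itself
    ~this() { ++destroyed; }     // runs deterministically for scoped instances
}

void useScoped()
{
    scope c = new C();           // instance is tied to this lexical scope
    // ... use c ...
}                                // c's destructor runs here, not at GC time

void main()
{
    useScoped();
    writeln("destructors run: ", C.destroyed);   // prints "destructors run: 1"
}
```

The same class could also be allocated with plain 'new' and left to the collector, which is exactly the case where a separate finalizer would come into play.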


Sean
Mike Capp
2006-04-05 17:51:29 UTC
In article <e10pk7$2khb$1 at digitaldaemon.com>, Sean Kelly says...
Post by Sean Kelly
I've been following a thread on GC in c.l.c++.m and something Herb
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
[snip]
Might it not be worthwhile to do something similar to the above? This
would allow objects to explicitly delete all their contained data in
instances where they are being used as auto objects, rather than always
relying on the GC for this purpose. I'll admit I don't particularly
like the idea of separate finalize() and ~this() methods, but it seems
an attractive enough feature that something along these lines may be
appropriate.
Personally I'm against it. I feel quite strongly that defining a destructor (or
finalizer) should be illegal for a GC type - it should only be allowed for a
class declared as 'auto'. If you need dtor-like behaviour, you should not be
using GC, and the compiler should tell you so.

I posted this opinion some weeks back in a similar discussion here, expecting to
be chased out of town with pitchforks, but the response was very positive.
Nobody could think of any counterexamples, at any rate.

cheers
Mike
kris
2006-04-06 07:28:43 UTC
Post by Mike Capp
In article <e10pk7$2khb$1 at digitaldaemon.com>, Sean Kelly says...
Post by Sean Kelly
I've been following a thread on GC in c.l.c++.m and something Herb
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
[snip]
Might it not be worthwhile to do something similar to the above? This
would allow objects to explicitly delete all their contained data in
instances where they are being used as auto objects, rather than always
relying on the GC for this purpose. I'll admit I don't particularly
like the idea of separate finalize() and ~this() methods, but it seems
an attractive enough feature that something along these lines may be
appropriate.
Personally I'm against it. I feel quite strongly that defining a destructor (or
finalizer) should be illegal for a GC type - it should only be allowed for a
class declared as 'auto'. If you need dtor-like behaviour, you should not be
using GC, and the compiler should tell you so.
I posted this opinion some weeks back in a similar discussion here, expecting to
be chased out of town with pitchforks, but the response was very positive.
Nobody could think of any counterexamples, at any rate.
cheers
Mike
Mike;

Instead of making the dtor illegal for GC types, why not remove the
'auto' keyword from this realm altogether, and just use the existence of
a dtor as the class RAII indicator?

Thus, any class with a dtor is automatically RAII. When the dtor is
actually invoked, all relevant GC allocations should still be intact; yes?

What to do about those classes that need a dtor-like construct, but
cannot be deemed RAII? Be explicit about closing them, using the close()
or dispose() approach.
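
For reference, the explicit close() approach might look roughly like this in present-day D. The Connection class and its members are hypothetical; the only language feature relied on is scope(exit).

```d
import std.stdio;

// Hypothetical resource wrapper with an explicit close() instead of a dtor.
class Connection
{
    private bool open = true;

    void close()
    {
        if (open)                // idempotent: a second close() is a no-op
        {
            open = false;
            writeln("connection closed");
        }
    }

    bool isOpen() const { return open; }
}

void main()
{
    auto conn = new Connection();
    scope(exit) conn.close();    // caller must remember to arrange this
    writeln("working");
}
```

The weak point is the one line in main: nothing forces the caller to write it, and forgetting it leaves the resource open until somebody notices.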

Thoughts?

- Kris
Mike Capp
2006-04-06 08:34:59 UTC
In article <e12fva$29gr$1 at digitaldaemon.com>, kris says...
Post by kris
Mike;
Instead of making the dtor illegal for GC types, why not remove the
'auto' keyword from this realm altogether, and just use the existence of
a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody
reading the code. They'd have to go and look at the class definition. I'm happy
to do a little extra typing for the sake of code clarity here, in the same way
that I thought C#'s insistence on having "in" and "ref" arguments marked as such
by calls as well as decls was a nice touch.
Post by kris
What to do about those classes that need a dtor-like construct, but
cannot be deemed RAII? Be explicit about closing them, using the close()
or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying
they don't exist, but I'm not assuming that they must, either. The "dispose"
(anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the extreme.


cheers
Mike
Georg Wrede
2006-04-06 11:49:28 UTC
Post by Dave
kris says...
Post by kris
Instead of making the dtor illegal for GC types, why not remove the
'auto' keyword from this realm altogether, and just use the
existence of a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to
somebody reading the code. They'd have to go and look at the class
definition. I'm happy to do a little extra typing for the sake of
code clarity here, in the same way that I thought C#'s insistence on
having "in" and "ref" arguments marked as such by calls as well as
decls was a nice touch.
FWIW, I fully agree.
Post by Dave
Post by kris
What to do about those classes that need a dtor-like construct, but
cannot be deemed RAII? Be explicit about closing them, using the
close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm
not saying they don't exist, but I'm not assuming that they must,
either. The "dispose" (anti-)pattern is, frankly, awful. It's "Wrong
By Default" taken to the extreme.
Yes, and thinking of a class that needs "destructing", which then may
happen much later (at GC time), or never at all -- is just insanity.
kris
2006-04-06 18:48:07 UTC
Post by Mike Capp
In article <e12fva$29gr$1 at digitaldaemon.com>, kris says...
Post by kris
Mike;
Instead of making the dtor illegal for GC types, why not remove the
'auto' keyword from this realm altogether, and just use the existence of
a dtor as the class RAII indicator?
The trouble is that this wouldn't make the RAII behaviour apparent to somebody
reading the code. They'd have to go and look at the class definition. I'm happy
to do a little extra typing for the sake of code clarity here, in the same way
that I thought C#'s insistence on having "in" and "ref" arguments marked as such
by calls as well as decls was a nice touch.
Yes, that is true.
Post by Mike Capp
Post by kris
What to do about those classes that need a dtor-like construct, but
cannot be deemed RAII? Be explicit about closing them, using the close()
or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying
they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program,
yet the OS cannot clean up by default. This includes external hardware
which should be reset or otherwise released and, more commonly, various
types of scant resources used for purposes of optimization ~ Regan noted
database resources, which are a good example. Others might include
termination network-handshaking, and so on. Such things are often
wrapped via a class, with the expectation said class can encapsulate the
cleanup process. Their scope (or life expectancy) is often intended to
span a considerable period of time.

In some cases it might be possible to arrange the code such that these
entities are actually scoped on the stack (for RAII purposes), where the
enclosing function doesn't exit until termination time. However, others
often have a life expectancy based upon "activity" ~ a classic example
might be cached database resources, where life-expectancy of the object
has nothing to do with scope per se, but is instead often based upon a
period of dormancy or inactivity.
Post by Mike Capp
The "dispose"
(anti-)pattern is, frankly, awful. It's "Wrong By Default" taken to the extreme.
This is the option left open after the discovery that dtor() is pretty
much worthless. I agree that a better solution is needed.
Regan Heath
2006-04-06 23:58:39 UTC
Post by kris
Post by Mike Capp
Post by kris
What to do about those classes that need a dtor-like construct, but
cannot be deemed RAII? Be explicit about closing them, using the
close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying
they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the program,
yet the OS cannot clean up by default. This includes external hardware
which should be reset or otherwise released and, more commonly, various
types of scant resources used for purposes of optimization ~ Regan noted
database resources, which are a good example.
I did, I also suggested some solutions:
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36462

- reference counting.
- a new 'shared' keyword.

The idea in that thread (isn't really a new idea) is essentially what Kris
Post by kris
Post by Mike Capp
Post by kris
Instead of making the dtor illegal for GC types, why not remove the
'auto' keyword from this realm altogether, and just use the existence
of a dtor as the class RAII indicator?
In my case; removal of 'auto' from object instance declaration, but
requiring it on class definitions when a dtor is present. Plus requiring
it on classes containing classes which are 'auto'.

After all a dtor indicates some (non-memory) cleanup needs to be done,
making it RAII by definition, no? And any class containing a reference
that needs cleanup, will itself need cleanup, right?

I think we need to try and come up with some examples of where it can't
work, and/or decide what the limitations are and if they're an
inappropriate cost to pay for what I think could be quite a safe system to
write RAII in.
Post by kris
Post by Mike Capp
The trouble is that this wouldn't make the RAII behaviour apparent to somebody
reading the code. They'd have to go and look at the class definition. I'm happy
to do a little extra typing for the sake of code clarity here, in the same way
that I thought C#'s insistence on having "in" and "ref" arguments marked as such
by calls as well as decls was a nice touch.
IMO the benefit outweighs this cost. Much like it does for 'out' etc
function parameters.

Regan
Daniel Keep
2006-04-17 07:21:34 UTC
Where to attach this post... aah well, this seems as good a spot as any,
I guess...

I won't pretend I'm an expert in these things, but it seems to me that
adding reference counting to D's wide range of memory management options
would solve most of these problems, yes?

The main case for keeping dtors with GCed objects is that sometimes you
have an object that needs to be cleaned up in some fashion, but which
isn't (or can't easily be) tied to a particular stack frame. If you
made this class reference counted, then it would be cleaned up the
second the last reference goes out of scope.
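
A rough sketch of what I mean, in present-day D, using a hand-rolled counted handle (all names here are invented for the example, and it is not thread-safe). The wrapper is a struct, so copies and destruction are tracked by the compiler rather than by the GC.

```d
import std.stdio;

// Hypothetical reference-counted wrapper around an integer handle.
struct Shared
{
    private static struct Box { int handle; size_t refs; }
    private Box* box;

    static int released;                          // demo-only counter

    this(int handle) { box = new Box(handle, 1); }
    this(this) { if (box) ++box.refs; }           // postblit: a copy adds a reference

    ~this()
    {
        // The last owner going away triggers the cleanup immediately.
        if (box && --box.refs == 0)
        {
            ++released;
            writeln("releasing handle ", box.handle);
        }
    }

    size_t count() const { return box ? box.refs : 0; }
}

void main()
{
    auto a = Shared(7);
    {
        auto b = a;              // second reference
        writeln(a.count);        // prints 2
    }                            // b goes out of scope: count drops back
    writeln(a.count);            // prints 1
}                                // a goes out of scope: "releasing handle 7"
```

Cycles are not handled at all here, which is the drawback discussed in the next paragraph.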

The common drawback is the argument that you then have to watch out for
cycles, but Python seems to be coping fine--it has a generational cycle
checker as far as I understand it, and I've seen papers for creating
thread-safe generational checkers so that wouldn't need to be a problem.

I think having lazy GC, RAII, manual memory management and ref. counting
would cover just about everything you could possibly want to do.

Plus, it'd be a great gloating point: "D: memory management YOUR way!"

-- Daniel

P.S. I beg forgiveness if I've oversimplified this.
Post by Regan Heath
Post by kris
Post by Mike Capp
Post by kris
What to do about those classes that need a dtor-like construct, but
cannot be deemed RAII? Be explicit about closing them, using the
close() or dispose() approach.
Can you give some concrete examples of such 'awkward' classes? I'm not saying
they don't exist, but I'm not assuming that they must, either.
Well, pretty much anything intended to be long-lived within the
program, yet the OS cannot clean up by default. This includes external
hardware which should be reset or otherwise released and, more
commonly, various types of scant resources used for purposes of
optimization ~ Regan noted database resources, which are a good example.
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36462
- reference counting.
- a new 'shared' keyword.
The idea in that thread (isn't really a new idea) is essentially what
Post by kris
Post by Mike Capp
Post by kris
Instead of making the dtor illegal for GC types, why not remove the
'auto' keyword from this realm altogether, and just use the
existence of a dtor as the class RAII indicator?
In my case; removal of 'auto' from object instance declaration, but
requiring it on class definitions when a dtor is present. Plus requiring
it on classes containing classes which are 'auto'.
After all a dtor indicates some (non-memory) cleanup needs to be done,
making it RAII by definition, no? And any class containing a reference
that needs cleanup, will itself need cleanup, right?
I think we need to try and come up with some examples of where it can't
work, and/or decide what the limitations are and if they're an
inappropriate cost to pay for what I think could be quite a safe system
to write RAII in.
Post by kris
Post by Mike Capp
The trouble is that this wouldn't make the RAII behaviour apparent to somebody
reading the code. They'd have to go and look at the class definition. I'm happy
to do a little extra typing for the sake of code clarity here, in the same way
that I thought C#'s insistence on having "in" and "ref" arguments marked as such
by calls as well as decls was a nice touch.
IMO the benefit outweighs this cost. Much like it does for 'out' etc
function parameters.
Regan
--
v1sw5+8Yhw5ln4+5pr6OFma8u6+7Lw4Tm6+7l6+7D
a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/
kris
2006-04-10 01:17:03 UTC
I thought it worthwhile to review the dtor behaviour and view the
concerns from a different direction:

dtor 'state' valid:
- explicit invocation via delete keyword
- explicit invocation via raii

dtor state 'unspecified':
- implicitly called when no more references are held to the object
- implicitly called when a program terminates


Just for fun, let's assume the 'unspecified' issue cannot be resolved.
Let's also assume there are dtors which expect to "clean up", and which
will fail when the dtor state is 'unspecified'.

What happens when a programmer forgets to explicitly delete such an
object? Well, the program is highly likely to fail (or be in an
inconsistent state) after the GC collects said object. This might be
before or during program termination.

How does one ensure this cannot occur? One obvious method would be for
the GC to /not/ invoke any dtor by default. While the GC would still
collect, such a change would ensure it cannot be the cause of a failing
program (it would also make the GC a little faster, but that's probably
beside the point).

Assuming that were the case, we're left with only the two cases where
cleanup is explicit and the dtor state is 'valid': via the delete
keyword, and via raii (both of which apply the same functionality).

This would tend to relieve the need for an explicit dispose() pattern,
since the dtor is now the equivalent?

What about implicit cleanup? In this scenario, it doesn't happen. If you
don't explicitly (via delete or via raii) delete an object, the dtor is
not invoked. This applies the notion that it's better to have a leak
than a dead program. The leak is a bug to be resolved.

What would be really nice is a tool to tell us about such leaks. It
should be possible for the GC (when configured to do so) to identify
collected objects which have a non-default dtor. In other words, the GC
can probably tell if a custom dtor is present (it has a different
address than a default dtor?). If the GC finds one of these during a
normal collection cycle, and is about to collect it, it might raise a
runtime error to indicate the leak instance?

Anyway ~ to summarize, this would have the following effect:

1) no more bogus crashes due to dtors being invoked in an invalid state
2) no need for the dispose() pattern
3) normal collection does not invoke dtors, making it a little faster
4) there's a possibility of a tool to identify and capture leaking
resources. Something which would be handy anyway.
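
The proposal itself needs a change to the GC, but its effect can be approximated per class. The sketch below is present-day D and assumes a druntime recent enough to provide GC.inFinalizer in core.memory; the Resource class and its counters are hypothetical. The dtor does its real work only when invoked explicitly, and merely reports a leak when the collector reaches it.

```d
import core.memory : GC;
import core.stdc.stdio : fprintf, stderr;

class Resource
{
    static int cleaned;   // demo-only: deterministic cleanups performed
    static int leaked;    // demo-only: instances reached by the collector

    ~this()
    {
        if (GC.inFinalizer())
        {
            // Collector path: GC-managed members may already be gone, so
            // skip the cleanup and only report. C stdio is used because
            // allocating GC memory during a collection is an error.
            ++leaked;
            fprintf(stderr, "leak: Resource collected without explicit cleanup\n");
            return;
        }
        ++cleaned;        // deterministic path: the real cleanup would go here
    }
}

void main()
{
    auto r = new Resource();
    destroy(r);           // explicit cleanup, the role 'delete' plays in this thread
    assert(Resource.cleaned == 1);
    assert(Resource.leaked == 0);
}
```

This covers points 1 and 4 of the summary only; the collector still has to call the dtor to produce the report, so point 3 would need support in the GC itself.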


For the sake of example: "unscoped" resources, such as connection-pools,
would operate per normal in this scenario: the pool elements should be
deleted explicitly by the hosting pool (or be treated as leaks, if they
have a custom dtor). The pool itself would have to be deleted explicitly
also ~ as is currently the case today ~ which can optionally be handled
via a module-dtor.

Thoughts?
Bruno Medeiros
2006-04-10 12:33:56 UTC
Post by kris
I thought it worthwhile to review the dtor behaviour and view the
- explicit invocation via delete keyword
- explicit invocation via raii
- implicitly called when no more references are held to the object
- implicitly called when a program terminates
Just for fun, let's assume the 'unspecified' issue cannot be resolved.
Let's also assume there are dtors which expect to "clean up", and which
will fail when the dtor state is 'unspecified'.
What happens when a programmer forgets to explicitly delete such an
object? Well, the program is highly likely to fail (or be in an
inconsistent state) after the GC collects said object. This might be
before or during program termination.
How does one ensure this cannot occur? One obvious method would be for
the GC to /not/ invoke any dtor by default. While the GC would still
collect, such a change would ensure it cannot be the cause of a failing
program (it would also make the GC a little faster, but that's probably
beside the point).
Assuming that were the case, we're left with only the two cases where
cleanup is explicit and the dtor state is 'valid': via the delete
keyword, and via raii (both of which apply the same functionality).
This would tend to relieve the need for an explicit dispose() pattern,
since the dtor is now the equivalent?
What about implicit cleanup? In this scenario, it doesn't happen. If you
don't explicitly (via delete or via raii) delete an object, the dtor is
not invoked. This applies the notion that it's better to have a leak
than a dead program. The leak is a bug to be resolved.
What would be really nice is a tool to tell us about such leaks. It
should be possible for the GC (when configured to do so) to identify
collected objects which have a non-default dtor. In other words, the GC
can probably tell if a custom dtor is present (it has a different
address than a default dtor?). If the GC finds one of these during a
normal collection cycle, and is about to collect it, it might raise a
runtime error to indicate the leak instance?
1) no more bogus crashes due to dtors being invoked in an invalid state
2) no need for the dispose() pattern
3) normal collection does not invoke dtors, making it a little faster
4) there's a possibility of a tool to identify and capture leaking
resources. Something which would be handy anyway.
For the sake of example: "unscoped" resources, such as connection-pools,
would operate per normal in this scenario: the pool elements should be
deleted explicitly by the hosting pool (or be treated as leaks, if they
have a custom dtor). The pool itself would have to be deleted explicitly
also ~ as is currently the case today ~ which can optionally be handled
via a module-dtor.
Thoughts?
All of those pros you mention are valid. But you'd have one serious con:
* Any class which required cleanup would have to be manually memory managed.
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
kris
2006-04-10 18:43:38 UTC
Post by Bruno Medeiros
Post by kris
I thought it worthwhile to review the dtor behaviour and view the
- explicit invocation via delete keyword
- explicit invocation via raii
- implicitly called when no more references are held to the object
- implicitly called when a program terminates
Just for fun, let's assume the 'unspecified' issue cannot be resolved.
Let's also assume there are dtors which expect to "clean up", and
which will fail when the dtor state is 'unspecified'.
What happens when a programmer forgets to explicitly delete such an
object? Well, the program is highly likely to fail (or be in an
inconsistent state) after the GC collects said object. This might be
before or during program termination.
How does one ensure this cannot occur? One obvious method would be for
the GC to /not/ invoke any dtor by default. While the GC would still
collect, such a change would ensure it cannot be the cause of a
failing program (it would also make the GC a little faster, but that's
probably beside the point).
Assuming that were the case, we're left with only the two cases where
cleanup is explicit and the dtor state is 'valid': via the delete
keyword, and via raii (both of which apply the same functionality).
This would tend to relieve the need for an explicit dispose() pattern,
since the dtor is now the equivalent?
What about implicit cleanup? In this scenario, it doesn't happen. If
you don't explicitly (via delete or via raii) delete an object, the
dtor is not invoked. This applies the notion that it's better to have
a leak than a dead program. The leak is a bug to be resolved.
What would be really nice is a tool to tell us about such leaks. It
should be possible for the GC (when configured to do so) to identify
collected objects which have a non-default dtor. In other words, the
GC can probably tell if a custom dtor is present (it has a different
address than a default dtor?). If the GC finds one of these during a
normal collection cycle, and is about to collect it, it might raise a
runtime error to indicate the leak instance?
1) no more bogus crashes due to dtors being invoked in an invalid state
2) no need for the dispose() pattern
3) normal collection does not invoke dtors, making it a little faster
4) there's a possibility of a tool to identify and capture leaking
resources. Something which would be handy anyway.
For the sake of example: "unscoped" resources, such as
connection-pools, would operate per normal in this scenario: the pool
elements should be deleted explicitly by the hosting pool (or be
treated as leaks, if they have a custom dtor). The pool itself would
have to be deleted explicitly also ~ as is currently the case today ~
which can optionally be handled via a module-dtor.
Thoughts?
* Any class which required cleanup would have to be manually memory managed.
Thanks;

First, let's change the verbiage of "valid" and "unspecified" to be
"deterministic" and "non-deterministic" respectively (per Don C).

This makes it clear that a dtor invoked /lazily/ by the GC will be
invoked in a non-deterministic state (how the GC works today). This
non-deterministic state means that it's very likely any or all
gc-managed references held purely by a class instance will already be
collected when the relevant dtor is invoked.

The other aspect to consider is the timeliness of cleanup. Mike suggests
that classes that actually have something to cleanup should do so in a
timely manner, and that the indicator for this is the presence of a dtor.

To get to your assertion: under the suggested model, any class with
resources that need to be released should either be 'delete'd at some
appropriate point, or have raii applied to it. Classes with dtors that
are not cleaned up in this manner can be treated as "leaks" (and can be
identified at runtime).

Thus, the term "manually memory managed" is not as clear as it might be:
raii can be used to clean up, and scope(exit) can be used to clean up. An
explicit 'delete' can be used to clean up. There's no malloc() or
anything like that involved.

The truly serious problem with a 'lazy' cleanup is that the dtor will
wind up invoked with non-deterministic state (typically leading to a
serious error). The other concern with lazy cleanup is what Mike
addresses (if the resource needs cleaning up, it should be done in a
timely manner ~ not at some arbitrary point in the future).

What would be an example of a class requiring cleanup, which should be
performed lazily? I can't think of a reasonable one off-hand, but let's
take an example anyway:

Suppose I have a class that holds a file-handle. This handle should be
released when the class is no longer in use. Luckily, the file-handle
does not require to be GC-managed itself (can be held by the class as an
integer). This provides us with two choices ~ release the handle in a
timely fashion, or release it at some undetermined point in the future
(when the class is collected). We're lucky to have a choice here; it's
actually something of a special case.

The model suggested follows Mike's proposal that the file-handle should
actually be released as soon as reasonably possible. RAII can be used to
ensure that happens automagically. What happens if said class is not
raii, and is not hit with a 'delete'? The suggested model can easily
identify that class instance as a "leak" when collected by the GC, and
report it as such. That is: instead of the GC-collector invoking the
dtor with a non-deterministic state, it instead identifies a leaking
resource.
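
To make the file-handle case concrete, here is a sketch in present-day D, where destroy() stands in for the 'delete' discussed in this thread. openHandle and closeHandle are hypothetical stand-ins for an OS API, not real system calls.

```d
import std.stdio;

// Hypothetical stand-ins for an OS file API.
int openCount;
int  openHandle(string path) { ++openCount; return 3; }
void closeHandle(int h)      { --openCount; }

class FileHolder
{
    private int handle = -1;     // plain integer, not a GC-managed reference

    this(string path) { handle = openHandle(path); }

    ~this()
    {
        // Touches only the integer, so no collected reference is involved.
        if (handle != -1)
        {
            closeHandle(handle);
            handle = -1;
        }
    }
}

void main()
{
    auto f = new FileHolder("data.txt");
    writeln("open handles: ", openCount);   // prints "open handles: 1"
    destroy(f);                             // timely, explicit release
    writeln("open handles: ", openCount);   // prints "open handles: 0"
}
```

Without the destroy() call the handle stays open until some later collection, which is the untimely case the model would flag as a leak.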

As far as automatic cleanup goes, I think D is already well armed via
raii and the scope() idiom. Adopting an attitude of cleaning up
resources in a timely manner will surely only be of benefit in the long
run?

Another approach here is to allow the collector to invoke the dtor (as
it does today), and somehow ensure that its state is fully deterministic
(which is not done today). I suspect that would be notably more
expensive and/or difficult to achieve? However, that also does not
address Mike's concern about timely cleanup, which I think is of valid
concern. Thus, I really like the simplicity of the model as described
above. It also has the added bonus of eliminating the need for a
redundant dispose() pattern, and makes the GC a little faster :-)

- Kris
Bruno Medeiros
2006-04-13 17:25:18 UTC
Post by kris
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Just one addendum: I was just pointing out that con, I wasn't saying it
was or was not, a bad idea overall.
Post by kris
First, let's change the verbiage of "valid" and "unspecified" to be
"deterministic" and "non-deterministic" respectively (per Don C).
Let's not. *g* See my reply to the Don.
Post by kris
To get to your assertion: under the suggested model, any class with
resources that need to be released should either be 'delete'd at some
appropriate point, or have raii applied to it. Classes with dtors that
are not cleaned up in this manner can be treated as "leaks" (and can be
identified at runtime).
raii can be used to clean up, and scope(exit) can be used to clean up. An
explicit 'delete' can be used to clean up. There's no malloc() or
anything like that involved.
Those are all manual memory management. (Even if auto and scope() are
much better than plain malloc/free).
[Note: RAII's auto = scope(exit)]
You would have an automatic leak/failure detection, true.
Post by kris
The truly serious problem with a 'lazy' cleanup is that the dtor will
wind up invoked with non-deterministic state (typically leading to a
serious error). The other concern with lazy cleanup is what Mike
addresses (if the resource needs cleaning up, it should be done in a
timely manner ~ not at some arbitrary point in the future).
The state is *undefined*, it is not "non-deterministic" nor
"deterministic". This is the kind of terminology blur up that I was
leery of. :P
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
kris
2006-04-13 17:54:35 UTC
Post by Bruno Medeiros
Post by kris
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Just one addendum: I was just pointing out that con, I wasn't saying it
was or was not, a bad idea overall.
Post by kris
First, let's change the verbiage of "valid" and "unspecified" to be
"deterministic" and "non-deterministic" respectively (per Don C).
Let's not. *g* See my reply to the Don.
heheh :)
Post by Bruno Medeiros
Post by kris
To get to your assertion: under the suggested model, any class with
resources that need to be released should either be 'delete'd at some
appropriate point, or have raii applied to it. Classes with dtors that
are not cleaned up in this manner can be treated as "leaks" (and can
be identified at runtime).
Thus, the term "manually memory managed" is not as clear as it might
be: raii can be used to clean up, and scope(exit) can be used to
clean up. An explicit 'delete' can be used to clean up. There's no
malloc() or anything like that involved.
Those are all manual memory management. (Even if auto and scope() are
much better than plain malloc/free).
[Note: RAII's auto = scope(exit)]
You would have an automatic leak/failure detection, true.
Post by kris
The truly serious problem with a 'lazy' cleanup is that the dtor will
wind up invoked with non-deterministic state (typically leading to a
serious error). The other concern with lazy cleanup is what Mike
addresses (if the resource needs cleaning up, it should be done in a
timely manner ~ not at some arbitrary point in the future).
The state is *undefined*, it is not "non-deterministic" nor
"deterministic". This is the kind of terminology blur up that I was
leery of. :P
:-D

Terminology aside; with the current implementation, invocation of dtors
during a collection often causes serious problems. That's why we see the
use of close/dispose patterns in D. It would be great to avoid both of
those things :p
kris
2006-04-10 19:15:20 UTC
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Can anyone come up with some examples whereby a class needs to cleanup,
and also /needs/ to be collected lazily? In other words, where raii or
delete could not be applied appropriately?
Sean Kelly
2006-04-10 19:57:37 UTC
Post by kris
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Can anyone come up with some examples whereby a class needs to cleanup,
and also /needs/ to be collected lazily? In other words, where raii or
delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object
isn't bound to a specific owner or scope--consider connection objects
for a server app. However, in most cases it's possible (and correct) to
delegate cleanup responsibility to a specific manager object or to link
it to the occurrence of some specific event. So far as
non-deterministic cleanup via dtors is concerned, I think it's mostly
implemented as a fail-safe. And it may be more correct to signal an
error if such an object is encountered via a GC run than to simply clean
it up silently, as a careful programmer might consider this a resource leak.
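
As an illustration of delegating cleanup to a manager, here is a sketch in present-day D. ClientConnection and ConnectionManager are invented names, and destroy() plays the role of 'delete'.

```d
import std.stdio;

class ClientConnection
{
    static int closed;           // demo-only counter
    int id;

    this(int id) { this.id = id; }

    ~this()
    {
        ++closed;
        writeln("connection ", id, " closed");
    }
}

// Hypothetical owner: connections outlive any one scope, but not the manager.
class ConnectionManager
{
    private ClientConnection[] live;

    ClientConnection open(int id)
    {
        auto c = new ClientConnection(id);
        live ~= c;
        return c;
    }

    // Explicit, deterministic teardown of everything the manager owns.
    void shutdown()
    {
        foreach (c; live)
            destroy(c);
        live = null;
    }
}

void main()
{
    auto mgr = new ConnectionManager();
    scope(exit) mgr.shutdown();
    mgr.open(1);
    mgr.open(2);
    writeln("serving");
}
```

Individual connections are never scoped themselves; only the manager's lifetime is, and any connection the collector finds still undestroyed would be the leak case.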


Sean
kris
2006-04-10 20:29:56 UTC
Post by Sean Kelly
Post by kris
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Can anyone come up with some examples whereby a class needs to
cleanup, and also /needs/ to be collected lazily? In other words,
where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object
isn't bound to a specific owner or scope--consider connection objects
for a server app. However, in most cases it's possible (and correct) to
delegate cleanup responsibility to a specific manager object or to link
it to the occurrence of some specific event.
Aye
Post by Sean Kelly
So far as
non-deterministic cleanup via dtors is concerned, I think it's mostly
implemented as a fail-safe. And it may be more correct to signal an
error if such an object is encountered via a GC run than to simply clean
it up silently, as a careful programmer might consider this a resource leak.
Yes; that's how I feel about it also. Especially when the "silent"
cleanup leads to SegFaults and such. Intended as a fail-safe, but
actually a failure-causation ;-)
Georg Wrede
2006-04-10 23:52:19 UTC
Permalink
Post by Sean Kelly
Post by kris
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Can anyone come up with some examples whereby a class needs to
cleanup, and also /needs/ to be collected lazily? In other words,
where raii or delete could not be applied appropriately?
Well, there are plenty of instances where the lifetime of an object
isn't bound to a specific owner or scope--consider connection objects
for a server app. However, in most cases it's possible (and correct) to
delegate cleanup responsibility to a specific manager object or to link
it to the occurrence of some specific event. So far as
non-deterministic cleanup via dtors is concerned, I think it's mostly
implemented as a fail-safe. And it may be more correct to signal an
error if such an object is encountered via a GC run than to simply clean
it up silently, as a careful programmer might consider this a resource leak.
Writing this kind of code demands that the programmer keeps (in his
mind) a clear picture of _who_ owns the instance.

Getting that unclear is a sure recipe for disaster.
Georg Wrede
2006-04-10 23:43:56 UTC
Permalink
Post by kris
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Can anyone come up with some examples whereby a class needs to cleanup,
and also /needs/ to be collected lazily? In other words, where raii or
delete could not be applied appropriately?
Got another idea.

It seems to me that this discussion is pretty abstract. Normally, half
the participants would be talking about Apples and the other half about
Oranges, without either noticing. But in this D newsgroup, I believe
the state of knowledge is high enough for that not to happen.

However, half of the _audience_ may not be clear that both
apples and oranges belong to the Class Magnoliopsida, and one of them to
the Order Rosales and the other to Sapindales. But which? (And I
certainly admit I belong to this Audience here.)

To serve and accommodate all, and to even possibly start to get
potentially worthwhile commentary from a larger group of eyes, I suggest
we try to construct the simplest Structure of Instances needed to
display _all_ of the discussed woes.

As a first draft (and not even remotely pretending it is adequate), I
cast the following:

VIEW THIS IN MONOSPACE FONT
===========================

code                    heap


iRa -----------------> alpha ---> beta
                         ^         /
                          \       /
                           \     /
                            \   V
                            gamma
                            ^   ^
                           /     \
                          /       \
                         /         \
                        V           V
iRb ----------------> delta <--> epsilon

(Oh, iR stands for Instance Reference, just to not get involved with the
types or classes:

SomeClass iRx = new SomeClass(); // Create a reference to an instance.
)

So, the upper half makes a singly linked list and the lower half makes a
doubly linked list, and then there are two references (or D variables)
pointing to the Alpha and Delta instances.

Can this structure demonstrate _all_ of the problems we're currently
discussing, or should it be more complicated?
Regan Heath
2006-04-10 23:25:31 UTC
Permalink
On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Not memory managed, surely.. the memory will still be collected by the GC,
all that changes is that the dtor is not invoked when that happens.. or at
least that is how I understood Kris's proposal.

Regan
Bruno Medeiros
2006-04-13 16:48:11 UTC
Permalink
Post by Regan Heath
On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Not memory managed, surely.. the memory will still be collected by the
GC, all that changes is that the dtor is not invoked when that happens..
or at least that is how I understood Kris's proposal.
Regan
Kris clearly mentioned that a class with a dtor (i.e. a class needing
cleanup) being collected by the GC would be an abnormal situation.
(which could, or not, be detected by the runtime.)
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Sean Kelly
2006-04-13 16:52:31 UTC
Permalink
Post by Bruno Medeiros
Post by Regan Heath
On Mon, 10 Apr 2006 13:33:56 +0100, Bruno Medeiros
Post by Bruno Medeiros
* Any class which required cleanup would have to be manually memory managed.
Not memory managed, surely.. the memory will still be collected by the
GC, all that changes is that the dtor is not invoked when that
happens.. or at least that is how I understood Kris's proposal.
Kris clearly mentioned that a class with a dtor (i.e. a class needing
cleanup) being collected by the GC would be an abnormal situation.
(which could, or not, be detected by the runtime.)
The version of Ares released yesterday has code in place to do this.
For now, you'll have to alter the finalizer if you want to do
something special (dmdrt/memory.d:cr_finalize), but eventually it
will probably call an onFinalizeError function in the standard library
that can be hooked in a similar manner to onAssertError. The error will
be signaled when the GC collects an object that has a dtor. Default
behavior will likely be to ignore it and move on.


Sean
Jarrett Billingsley
2006-04-05 20:41:22 UTC
Permalink
"Sean Kelly" <sean at f4.ca> wrote in message
Post by Sean Kelly
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end of
scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
Would you mind explaining why exactly there needs to be a difference between
destructors and finalizers? I've been following all the arguments about
this heap vs. auto classes and dtors vs. finalizers, and I still can't
figure out why destructors _can't be the finalizers_. Do finalizers do
something fundamentally different from destructors?
Sean Kelly
2006-04-05 21:20:00 UTC
Permalink
Post by Jarrett Billingsley
"Sean Kelly" <sean at f4.ca> wrote in message
Post by Sean Kelly
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end of
scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
Would you mind explaining why exactly there needs to be a difference between
destructors and finalizers? I've been following all the arguments about
this heap vs. auto classes and dtors vs. finalizers, and I still can't
figure out why destructors _can't be the finalizers_. Do finalizers do
something fundamentally different from destructors?
Since finalizers are called when the GC destroys an object, they are
very limited in what they can do. They can't assume any GC managed
object they have a reference to is valid, etc. By contrast, destructors
can make this assumption, because the object is being destroyed
deterministically. I think having both may be too confusing to be
worthwhile, but it would allow for things like this:

class Node {
    Node next;
}

class LinkedList {
    Node top;

    ~this() { // called deterministically
        for( Node n = top; n; ) {
            Node t = n.next; // D uses '.' for member access, not '->'
            delete n;
            n = t;
        }
        finalize();
    }

    void finalize() { // called by GC
        // nodes may have already been destroyed
        // so leave them alone, but special
        // resources could be reclaimed
    }
}

The argument against finalizers, as Mike mentioned, is that you
typically want to reclaim such special resources deterministically, so
letting the GC take care of this 'someday' is of questionable utility.


Sean
kris
2006-04-05 22:13:17 UTC
Permalink
Post by Sean Kelly
Post by Jarrett Billingsley
"Sean Kelly" <sean at f4.ca> wrote in message
Post by Sean Kelly
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
Would you mind explaining why exactly there needs to be a difference
between destructors and finalizers? I've been following all the
arguments about this heap vs. auto classes and dtors vs. finalizers,
and I still can't figure out why destructors _can't be the
finalizers_. Do finalizers do something fundamentally different from
destructors?
Since finalizers are called when the GC destroys an object, they are
very limited in what they can do. They can't assume any GC managed
object they have a reference to is valid, etc. By contrast, destructors
can make this assumption, because the object is being destroyed
deterministically. I think having both may be too confusing to be
worthwhile, but it would allow for things like this:
class LinkedList {
    ~this() { // called deterministically
        for( Node n = top; n; ) {
            Node t = n.next;
            delete n;
            n = t;
        }
        finalize();
    }
    void finalize() { // called by GC
        // nodes may have already been destroyed
        // so leave them alone, but special
        // resources could be reclaimed
    }
}
The argument against finalizers, as Mike mentioned, is that you
typically want to reclaim such special resources deterministically, so
letting the GC take care of this 'someday' is of questionable utility.
Yes, it is. The "death tractors" (dtors in D) are notably less than
useful right now. Any dependencies are likely in an unknown state (as
you note), and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object?
It's actually quite hard to nail down when they /are/ invoked :)

Regardless; any "special resources" one would, somewhat naturally, wish
to cleanup via dtors have to be explicitly managed via other means. This
usually means a global application-list of "special stuff", which does
not seem to jive with OOP very well?

On the face of it, it shouldn't be hard for the GC to invoke dtors in
such a manner whereby dependencies are preserved ~ that would at least
help. But then, the whole notion is somewhat worthless (in D) when it's
implemented as a non-deterministic activity.

Given all that, the finalizer behaviour mentioned above sounds rather
like the current death-tractor behaviour?
Jarrett Billingsley
2006-04-05 22:50:33 UTC
Permalink
Yes, it is. The "death tractors" (dtors in D) are notably less than useful
right now. Any dependencies are likely in an unknown state (as you note),
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object?
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic
"list of special stuff" that you mention - you just 'delete' them all,
perhaps in a certain order.

In fact, the dtors are also called on program exit - as long as they're not
in some kind of array. I don't know if that's a bug, or by design, or a
foggy area of the spec, or a combination of all of the above.
Regardless; any "special resources" one would, somewhat naturally, wish
to cleanup via dtors have to be explicitly managed via other means. This
usually means a global application-list of "special stuff", which does not
seem to jive with OOP very well?
I kind of agree with you, but at the same time, I just take the stance that
although it's useful, _the GC can't be trusted_. Unless a custom GC is
written for every program and every possible arrangement of data, it can't
know in what order to call dtors/finalizers and whatnot. So I do end up
keeping lists of all types of objects that I want to be called
deterministically, and delete them on program exit. I just leave the simple
/ common stuff (throwaway class instances, string crap) to the GC. That
just makes me feel a lot better and safer.
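
To make the "keep a list and delete on exit" approach concrete, here is a small sketch. Resource, registry, track() and releaseAll() are names invented for this example; close() stands in for whatever explicit cleanup the objects need, and the reverse release order is just one possible "certain order".

```d
// Hypothetical sketch: Resource and the registry are illustrative names.
class Resource
{
    static int released; // counts explicit cleanups, for demonstration only
    void close() { ++released; }
}

Resource[] registry; // objects that need deterministic cleanup

Resource track(Resource r)
{
    registry ~= r;
    return r;
}

// Release in reverse order of registration, then forget the objects.
void releaseAll()
{
    foreach_reverse (r; registry)
        r.close();
    registry = null;
}

void main()
{
    scope(exit) releaseAll(); // runs at the end of main, not during a GC pass
    track(new Resource);
    track(new Resource);
}
```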

In addition, I usually don't assume that any references a class holds are
valid in the dtor. I leave the cleanup of other objects (like in Sean's
example) to the other objects' dtors.
On the face of it, it shouldn't be hard for the GC to invoke dtors in
such a manner whereby dependencies are preserved ~ that would at least
help. But then, the whole notion is somewhat worthless (in D) when it's
implemented as a non-deterministic activity.
Yeah, I was thinking about that, maybe instead of just looping through all
class instances linearly and deleting everything, just keep running GC
passes until the regular GC pass has no effect, and brute force the rest.
In this way, my method of "not deleting other objects in dtors" would delete
the instance of the LinkedList on the first pass, and then all the Nodes on
the second, since they are now orphaned.
kris
2006-04-05 23:40:58 UTC
Permalink
Post by Jarrett Billingsley
Yes, it is. The "death tractors" (dtors in D) are notably less than useful
right now. Any dependencies are likely in an unknown state (as you note),
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object?
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic
"list of special stuff" that you mention - you just 'delete' them all,
perhaps in a certain order.
I ended up using my own 'finalizer' since, back in the day, delete
didn't invoke the dtor. It does now, so that's something. Objects that
refer to anything external should probably have a close() method anyway
~ which gets us back to what Mike had noted.
Post by Jarrett Billingsley
In fact, the dtors are also called on program exit - as long as they're not
in some kind of array. I don't know if that's a bug, or by design, or a
foggy area of the spec, or a combination of all of the above.
Interesting. It does appear to do that now, whereas in the past it
didn't. I remember a post from someone complaining that it took 5
minutes for his program to exit because the GC was run to completion on
all 10,000,000,000 objects he had (or something like that). The "fix"
for that appeared to be "just don't cleanup on exit", which then
sidestepped all dtors. It seems something changed along the way, since
dtors do indeed get invoked at program termination for a simple test
program (not if an exception is thrown, though). My bad.

Does this happen consistently, then? I mean, are dtors invoked on all
remaining Objects during exit? At all times? Is that even a good idea?
Post by Jarrett Billingsley
Regardless; any "special resources" one would, somewhat naturally, wish
to cleanup via dtors have to be explicitly managed via other means. This
usually means a global application-list of "special stuff", which does not
seem to jive with OOP very well?
I kind of agree with you, but at the same time, I just take the stance that
although it's useful, _the GC can't be trusted_. Unless a custom GC is
written for every program and every possible arrangement of data, it can't
know in what order to call dtors/finalizers and whatnot. So I do end up
keeping lists of all types of objects that I want to be called
deterministically, and delete them on program exit. I just leave the simple
/ common stuff (throwaway class instances, string crap) to the GC. That
just makes me feel a lot better and safer.
The GC is supposed to be your friend :)

That doesn't mean it should know about your design but, there again, it
shouldn't abort it either. That implies any additional GC references
held by a dtor Object really should be valid whenever that dtor is
invoked. The fact that they're not relegates dtors to having
insignificant value ~ which somehow doesn't seem right. Frankly, I don't
clearly understand why they're in D at all ~ too little consistency.
Post by Jarrett Billingsley
In addition, I usually don't assume that any references a class holds are
valid in the dtor. I leave the cleanup of other objects (like in Sean's
example) to the other objects' dtors.
On the face of it, it shouldn't be hard for the GC to invoke dtors in
such a manner whereby dependencies are preserved ~ that would at least
help. But then, the whole notion is somewhat worthless (in D) when it's
implemented as a non-deterministic activity.
Yeah, I was thinking about that, maybe instead of just looping through all
class instances linearly and deleting everything, just keep running GC
passes until the regular GC pass has no effect, and brute force the rest.
In this way, my method of "not deleting other objects in dtors" would delete
the instance of the LinkedList on the first pass, and then all the Nodes on
the second, since they are now orphaned.
Yep

If references were still valid for dtors, and dtors were invoked in a
deterministic manner, perhaps all we'd need is something similar to
"scope(exit)", but referring to a global scope instead? Should the
memory manager take care of the latter?
Dave
2006-04-06 02:49:24 UTC
Permalink
In article <e11ki9$rtq$1 at digitaldaemon.com>, kris says...
Post by Jarrett Billingsley
Yes, it is. The "death tractors" (dtors in D) are notably less than useful
right now. Any dependencies are likely in an unknown state (as you note),
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object?
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic
"list of special stuff" that you mention - you just 'delete' them all,
perhaps in a certain order.
Ok, so for non-auto death tractors (that name is great):

a) non-auto D class dtors are actually what are called finalizers everywhere
else, except when delete is explicitly called.
b) although dtors are eventually all called, it is non-deterministic unless the
class is auto, or delete is used explicitly.
c) unless dtors are called deterministically, they could often be considered
worthless since, w/ a GC handling memory, the primary reason for dtors is to
release other expensive external resources.
d) there is (a lot of) overhead involved with 'dtors for every class'.
e) All this has been a major sticking-point of other languages and runtimes
(like VB & C#.NET). Because of c) and d), in those languages, the workaround
they use is finalizers instead of dtors (they also have Dispose, but that needs
to be called explicitly), and using(...) takes the place of auto/delete. IIRC,
exactly when these finalizers are called is always non-deterministic and not
even guaranteed unless an explicit "full collect" is done, and a big part of
this is precisely because it's so expensive. Although I program in those
languages day to day, because of this, I don't rely on anything that is going on
behind the scenes as I've always ended-up explicitly "finalizing" things myself
rather than relying on the GC or the using(...) statement. If you've done a lot
of DB work in .NET (for example), then you'll know that doing this is sometimes
as bothersome as malloc/free or new/delete (and Thank God for .NET's
try/finally). That is a major reason I think finalizers are useless unless
they're always deterministic.
From some tests I've done in the past and recently duplicated in
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36258, just attempting
to set a finalizer is damned expensive, and a lot of that expense is because
setFinalizer needs to be synchronized. IIRC, in the tests I've run in the past,
if the finalizer overhead is removed, the current GC can actually run as fast
for smallish class objects over several collections, as new/delete for C++
classes or malloc/free for C structs.

There is not only an expense involved in setting the finalizer, but the way it
works in the current D GC is that there is overhead involved in every collection
checking for finalizers, even for non-class objects. It looks to me like if all
the non-deterministic finalization cruft could be removed from the GC, the
*current* GC may actually be a little faster than malloc/free for class objects
(at least moderately sized ones).

Long and short of it is I like Mike's ideas regarding allowing dtors for only
auto classes. In that way, the GC wouldn't have to deal with finalizers at all,
or at least during non-deterministic collections. It would also still allow D to
claim RAII because 'auto' classes are something new for D compared to most other
languages.

It may be that taking care of the finalizer overhead issue is a must if D GC's
will ever be able to perform as well as other languages for class objects.

Kind-of ironic; the goals of D are to be as powerful as C++, yet make compilers
relatively easy to develop - but a side effect of those two is that really good
GC's may be harder to develop than the compilers <g>

- Dave
kris
2006-04-06 03:02:32 UTC
Permalink
Post by Dave
In article <e11ki9$rtq$1 at digitaldaemon.com>, kris says...
Post by Jarrett Billingsley
Yes, it is. The "death tractors" (dtors in D) are notably less than useful
right now. Any dependencies are likely in an unknown state (as you note),
and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object?
It's actually quite hard to nail down when they /are/ invoked :)
They are invoked when you call delete. This is how you do the deterministic
"list of special stuff" that you mention - you just 'delete' them all,
perhaps in a certain order.
a) non-auto D class dtors are actually what are called finalizers everywhere
else, except when delete is explicitly called.
b) although dtors are eventually all called, it is non-deterministic unless the
class is auto, or delete is used explicitly.
c) unless dtors are called deterministically, they could often be considered
worthless since, w/ a GC handling memory, the primary reason for dtors is to
release other expensive external resources.
d) there is (a lot of) overhead involved with 'dtors for every class'.
e) All this has been a major sticking-point of other languages and runtimes
(like VB & C#.NET). Because of c) and d), in those languages, the workaround
they use is finalizers instead of dtors (they also have Dispose, but that needs
to be called explicitly), and using(...) takes the place of auto/delete. IIRC,
exactly when these finalizers are called is always non-deterministic and not
even guaranteed unless an explicit "full collect" is done, and a big part of
this is precisely because it's so expensive. Although I program in those
languages day to day, because of this, I don't rely on anything that is going on
behind the scenes as I've always ended-up explicitly "finalizing" things myself
rather than relying on the GC or the using(...) statement. If you've done a lot
of DB work in .NET (for example), then you'll know that doing this is sometimes
as bothersome as malloc/free or new/delete (and Thank God for .NET's
try/finally). That is a major reason I think finalizers are useless unless
they're always deterministic.
From some tests I've done in the past and recently duplicated in
http://www.digitalmars.com/drn-bin/wwwnews?digitalmars.D/36258, just attempting
to set a finalizer is damned expensive, and a lot of that expense is because
setFinalizer needs to be synchronized. IIRC, in the tests I've run in the past,
if the finalizer overhead is removed, the current GC can actually run as fast
for smallish class objects over several collections, as new/delete for C++
classes or malloc/free for C structs.
There is not only an expense involved in setting the finalizer, but the way it
works in the current D GC is that there is overhead involved in every collection
checking for finalizers, even for non-class objects. It looks to me like if all
the non-deterministic finalization cruft could be removed from the GC, the
*current* GC may actually be a little faster than malloc/free for class objects
(at least moderately sized ones).
Long and short of it is I like Mike's ideas regarding allowing dtors for only
auto classes. In that way, the GC wouldn't have to deal with finalizers at all,
or at least during non-deterministic collections. It would also still allow D to
claim RAII because 'auto' classes are something new for D compared to most other
languages.
I could buy that too, if the darned "auto" keyword weren't so overloaded :-P

[snip]
Jarrett Billingsley
2006-04-06 04:08:43 UTC
Permalink
"Dave" <Dave_member at pathlink.com> wrote in message
Post by Dave
Long and short of it is I like Mike's ideas regarding allowing dtors for only
auto classes. In that way, the GC wouldn't have to deal with finalizers at all,
or at least during non-deterministic collections. It would also still allow D to
claim RAII because 'auto' classes are something new for D compared to most other
languages.
Hmm. 'auto' works well and good for classes whose references are local
variables, but .. what about objects whose lifetimes aren't determined by
the return of a function?

I.e. the Node class is used only in LinkedList. When a LinkedList is
killed, all its Nodes must die as well. Since the Node references are kept
in the LinkedList and not as local variables, there's no way to specify
'auto' for them.

Then you start getting into a catch-22. Okay, so you need to delete all
those child Nodes in the dtor of LinkedList, meaning that LinkedList has to
be made auto so it can have a dtor. But what if a linked list reference has
to exist at global level, or in a struct? There is no function return to
determine when to delete the list. So you have to make LinkedList non-auto,
but then that means that you can't delete all those child nodes since you
don't have a dtor / finalizer, etc..

I think RAII is nice, but it doesn't seem to fix everything. Unless, of
course, it were extended to deal with these odd cases.
Regan Heath
2006-04-06 04:59:06 UTC
Permalink
On Thu, 6 Apr 2006 00:08:43 -0400, Jarrett Billingsley <kb3ctd2 at yahoo.com>
Post by Jarrett Billingsley
"Dave" <Dave_member at pathlink.com> wrote in message
Post by Dave
Long and short of it is I like Mike's ideas regarding allowing dtors for only
auto classes. In that way, the GC wouldn't have to deal with finalizers at all,
or at least during non-deterministic collections. It would also still allow D to
claim RAII because 'auto' classes are something new for D compared to most other
languages.
Hmm. 'auto' works well and good for classes whose references are local
variables, but .. what about objects whose lifetimes aren't determined by
the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList is
killed, all its Nodes must die as well.
Assuming the nodes contain reference(s) to resources (other than memory)
that need to be released, right? You don't need to delete them to free
memory, the GC should free them eventually.

The same is true for any non-auto object which contains a sub-object which
has a reference to a resource that must be released deterministically.
Isn't the solution therefore to make every object containing an 'auto'
object 'auto' as well?

How about this:

1) If a class has a dtor it must be auto, eg.

class A { ~this() {} } //error A must be auto
auto class A { ~this() {} } //ok


2) If a class contains a reference to an auto class, it must also be auto,
eg.

class B { A a; } //error A is auto, B must be auto too
auto class B { A a; ~this() { delete a; } } //ok

2a) If that class does not have a dtor it is an error.
2b) If that dtor does not delete the 'a' reference it is an error.

Speculative:
Can the compiler in fact auto-generate a dtor for this class? One that
deletes all auto references.
Can it append (not prepend) that auto-generated dtor to any user supplied
one?


3) Remove the other 'auto' class syntax, i.e.

class A {}
auto A a = new A();

It's either a class with resources that need to be freed, or it's not. Is
there any need for a middle ground?
(this also removes the double use of auto, that'll make some people happy)


Pros:
1. no more weird crashes in dtors where people reference things which are
gone.
2. compiler finds/corrects most reference leaks automatically.
3. no more double use of 'auto'.

Cons:
1. less flexible?


I can already think of a situation where this might be too inflexible.
What happens if you want to share an object between multiple objects, for
example:
auto class DatabaseConnection {}

a singleton style shared connection to a database. You have several
classes which share that connection, i.e.
class UserQuery { DatabaseConnection c; }

using the rules above these classes would either be illegal, or get a dtor
which auto-deletes the DatabaseConnection.

The solution? Perhaps it's reference counting in the DatabaseConnection?
Perhaps it's a new syntax to mark something 'shared', preventing the
compiler auto-deleting it. eg.
class UserQuery { shared DatabaseConnection c; }

Perhaps this cure is worse than the disease? Thoughts?
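
To illustrate the reference-counting option, a rough sketch: acquire(), release(), closed() and the constructor shown here are assumptions made up for the example, not part of the proposal's syntax, and the boolean flag stands in for a real disconnect.

```d
// Hypothetical sketch: acquire/release and the counter are illustrative names.
class DatabaseConnection
{
    private int refs;
    private bool isClosed;

    void acquire() { ++refs; }

    // The last holder to release closes the connection deterministically.
    void release()
    {
        assert(refs > 0);
        if (--refs == 0)
            isClosed = true; // stand-in for the real disconnect
    }

    bool closed() const { return isClosed; }
}

class UserQuery
{
    private DatabaseConnection c;

    this(DatabaseConnection conn) { c = conn; c.acquire(); }

    // Explicit cleanup; gives up this query's share of the connection.
    void close()
    {
        if (c !is null) { c.release(); c = null; }
    }
}

void main()
{
    auto db = new DatabaseConnection;
    auto q1 = new UserQuery(db);
    auto q2 = new UserQuery(db);
    q1.close();
    assert(!db.closed()); // q2 still holds a reference
    q2.close();
    assert(db.closed());  // last reference gone
}
```

With this arrangement no UserQuery deletes the connection outright, so sharing stays safe, at the cost of every holder having to call close().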

Regan
kris
2006-04-06 05:15:42 UTC
Permalink
Post by Jarrett Billingsley
"Dave" <Dave_member at pathlink.com> wrote in message
Post by Dave
Long and short of it is I like Mike's ideas regarding allowing dtors for only
auto classes. In that way, the GC wouldn't have to deal with finalizers at all,
or at least during non-deterministic collections. It would also still allow D to
claim RAII because 'auto' classes are something new for D compared to most other
languages.
Hmm. 'auto' works well and good for classes whose references are local
variables, but .. what about objects whose lifetimes aren't determined by
the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList is
killed, all its Nodes must die as well. Since the Node references are kept
in the LinkedList and not as local variables, there's no way to specify
'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they
are also managed by the GC :)

So, as I understand it, one cannot legitimately execute that example.

[snip]
Sean Kelly
2006-04-06 18:50:23 UTC
Permalink
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose references are
local variables, but .. what about objects whose lifetimes aren't
determined by the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList is
killed, all its Nodes must die as well. Since the Node references are
kept in the LinkedList and not as local variables, there's no way to
specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if they
are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)


Sean
kris
2006-04-06 18:57:18 UTC
Permalink
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose references are
local variables, but .. what about objects whose lifetimes aren't
determined by the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList is
killed, all its Nodes must die as well. Since the Node references
are kept in the LinkedList and not as local variables, there's no way
to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if
they are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
Sean
<g> Touché!
Georg Wrede
2006-04-06 19:22:10 UTC
Post by kris
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose references are
local variables, but .. what about objects whose lifetimes aren't
determined by the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList
is killed, all its Nodes must die as well. Since the Node
references are kept in the LinkedList and not as local variables,
there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if
they are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché!
Hey, hey, hey...

If anybody deletes stuff from a linked list, isn't it their
responsibility to fix the pointers of the previous and/or the next item,
to "bypass" that item??????

The mere fact that no "outside" references exist to a particular item in
a linked list does _not_ make this item eligible for GC.

Not in the current implementation, and I dare say, in no future
implementation ever.

In other words, it is _guaranteed_ that _all_ items in a linked list are
valid.

This could be called a "linked-list-invariant". :-)
Lars Ivar Igesund
2006-04-06 19:29:32 UTC
Post by Georg Wrede
Post by kris
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose references are
local variables, but .. what about objects whose lifetimes aren't
determined by the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList
is killed, all its Nodes must die as well. Since the Node
references are kept in the LinkedList and not as local variables,
there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if
they are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché!
Hey, hey, hey...
If anybody deletes stuff from a linked list, isn't it their
responsibility to fix the pointers of the previous and/or the next item,
to "bypass" that item??????
The mere fact that no "outside" references exist to a particular item in
a linked list does _not_ make this item eligible for GC.
Not in the current implementation, and I dare say, in no future
implementation ever.
In other words, it is _guaranteed_ that _all_ items in a linked list are
valid.
Not if the linked list is circular (such that all items are linked to), but
disjoint from the roots kept by the GC. This memory will be lost to a
conservative GC, but can be detected by some of the other types around.
Georg Wrede
2006-04-06 22:14:43 UTC
Post by Lars Ivar Igesund
Post by Georg Wrede
Post by kris
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose references are
local variables, but .. what about objects whose lifetimes aren't
determined by the return of a function?
I.e. the Node class is used only in LinkedList. When a LinkedList
is killed, all its Nodes must die as well. Since the Node
references are kept in the LinkedList and not as local variables,
there's no way to specify 'auto' for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being valid if
they are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché!
Hey, hey, hey...
If anybody deletes stuff from a linked list, isn't it their
responsibility to fix the pointers of the previous and/or the next item,
to "bypass" that item??????
The mere fact that no "outside" references exist to a particular item in
a linked list does _not_ make this item eligible for GC.
Not in the current implementation, and I dare say, in no future
implementation ever.
In other words, it is _guaranteed_ that _all_ items in a linked list are
valid.
Not if the linked list is circular (such that all items are linked to), but
disjoint from the roots kept by the GC. This memory will be lost to a
conservative GC, but can be detected by some of the other types around.
If the linked list is circular, and at the same time there's no
reference to this list from any GC examined area, then I'd consider this
as a Programmer Fault.

Any set of "items", none of which is referenced from a "roots" area, is
IMHO eligible for deletion. Whether this set is circular or not.

In other words, we should not strive to make the GC "too smart" for its
own good. Either we see to it that items not wished for deletion are
pointed to, or we accept that non-pointed-to items are considered passe.
Georg Wrede
2006-04-07 06:19:20 UTC
Post by Lars Ivar Igesund
Post by Georg Wrede
Post by kris
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose
references are local variables, but .. what about objects
whose lifetimes aren't determined by the return of a
function?
I.e. the Node class is used only in LinkedList. When a
LinkedList is killed, all its Nodes must die as well.
Since the Node references are kept in the LinkedList and
not as local variables, there's no way to specify 'auto'
for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being
valid if they are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché!
Hey, hey, hey...
If anybody deletes stuff from a linked list, isn't it their
responsibility to fix the pointers of the previous and/or the next
item, to "bypass" that item??????
The mere fact that no "outside" references exist to a particular
item in a linked list does _not_ make this item eligible for GC.
Not in the current implementation, and I dare say, in no future
implementation ever.
In other words, it is _guaranteed_ that _all_ items in a linked
list are valid.
Not if the linked list is circular (such that all items are linked
to), but disjoint from the roots kept by the GC. This memory will be
lost to a conservative GC, but can be detected by some of the other
types around.
The mere existence of a circular list that is not pointed-to from the
outside, is a programmer error. Unless one explicitly wants it to be
collected. But even then it's a programmer error if the items need
destructing, since the collection may or may not happen "ever".

So, in practice, whenever one wants to store items that need destructors
in a linked list, the list itself should be encapsulated in a class that
can guarantee the timely destruction of the items, as opposed to merely
abandoning them.
Lars Ivar Igesund
2006-04-07 06:49:06 UTC
Post by Georg Wrede
Post by Lars Ivar Igesund
Post by Georg Wrede
Post by kris
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Hmm. 'auto' works well and good for classes whose
references are local variables, but .. what about objects
whose lifetimes aren't determined by the return of a
function?
I.e. the Node class is used only in LinkedList. When a
LinkedList is killed, all its Nodes must die as well.
Since the Node references are kept in the LinkedList and
not as local variables, there's no way to specify 'auto'
for them.
Heck, the LinkedList dtor /cannot/ rely on the nodes being
valid if they are also managed by the GC :)
So, as I understand it, one cannot legitimately execute that example.
...unless the LinkedList has a deterministic lifetime :-)
<g> Touché!
Hey, hey, hey...
If anybody deletes stuff from a linked list, isn't it their
responsibility to fix the pointers of the previous and/or the next
item, to "bypass" that item??????
The mere fact that no "outside" references exist to a particular
item in a linked list does _not_ make this item eligible for GC.
Not in the current implementation, and I dare say, in no future
implementation ever.
In other words, it is _guaranteed_ that _all_ items in a linked
list are valid.
Not if the linked list is circular (such that all items are linked
to), but disjoint from the roots kept by the GC. This memory will be
lost to a conservative GC, but can be detected by some of the other
types around.
The mere existence of a circular list that is not pointed-to from the
outside, is a programmer error. Unless one explicitly wants it to be
collected. But even then it's a programmer error if the items need
destructing, since the collection may or may not happen "ever".
Maybe it is a programmer's error, but at the same time a programmer expects a
GC to collect memory that is no longer referenced by the program. Also the
list might be generated by a complex enough program to actually make it
difficult to see that it is a circular linked list. Depending on the GC, it
might or might not be able to reclaim this memory (or call the
destructors/finalizers of the objects in the list), because it no longer
explicitly knows about it.
Sean Kelly
2006-04-06 04:42:06 UTC
Post by kris
Interesting. It does appear to do that now, whereas in the past it
didn't. I remember a post from someone complaining that it took 5
minutes for his program to exit because the GC was run to completion on
all 10,000,000,000 objects he had (or something like that). The "fix"
for that appeared to be "just don't cleanup on exit", which then
sidestepped all dtors. It seems something changed along the way, since
dtors do indeed get invoked at program termination for a simple test
program (not if an exception is thrown, though). My bad.
Does this happen consistently, then? I mean, are dtors invoked on all
remaining Objects during exit? At all times? Is that even a good idea?
Yes, yes, yes, maybe :-) It's the call to gc.fullCollectNoStack in
gc_term. There are alternatives that might work nearly as well (ie. the
techniques you've used in the past) if shutdown time is an issue.
Post by kris
That doesn't mean it should know about your design but, there again, it
shouldn't abort it either. That implies any additional GC references
held by a dtor Object really should be valid whenever that dtor is
invoked. The fact that they're not relegates dtors to having
insignificant value ~ which somehow doesn't seem right. Frankly, I don't
clearly understand why they're in D at all ~ too little consistency.
Because not having them tends to inspire people to invent their own,
like the dispose() convention in Java. Having language support is
preferable, even if the functionality isn't terrific.
Post by kris
Post by Jarrett Billingsley
Yeah, I was thinking about that, maybe instead of just looping through
all class instances linearly and deleting everything, just keep
running GC passes until the regular GC pass has no effect, and brute
force the rest. In this way, my method of "not deleting other objects
in dtors" would delete the instance of the LinkedList on the first
pass, and then all the Nodes on the second, since they are now orphaned.
Yep
If references were still valid for dtors, and dtors were invoked in a
deterministic manner, perhaps all we'd need is something similar to
"scope(exit)", but referring to a global scope instead? Should the
memory manager take care of the latter?
In most cases this would work, but what about orphaned cycles? The GC
would ultimately just have to pick a place to start. Also, I think
disentangling a complex web of references could be somewhat time
intensive, and collection runs are already too slow :-)


Sean
kris
2006-04-06 05:08:55 UTC
Post by Sean Kelly
Post by kris
Interesting. It does appear to do that now, whereas in the past it
didn't. I remember a post from someone complaining that it took 5
minutes for his program to exit because the GC was run to completion
on all 10,000,000,000 objects he had (or something like that). The
"fix" for that appeared to be "just don't cleanup on exit", which then
sidestepped all dtors. It seems something changed along the way, since
dtors do indeed get invoked at program termination for a simple test
program (not if an exception is thrown, though). My bad.
Does this happen consistently, then? I mean, are dtors invoked on all
remaining Objects during exit? At all times? Is that even a good idea?
Yes, yes, yes, maybe :-) It's the call to gc.fullCollectNoStack in
gc_term. There are alternatives that might work nearly as well (ie. the
techniques you've used in the past) if shutdown time is an issue.
Post by kris
That doesn't mean it should know about your design but, there again,
it shouldn't abort it either. That implies any additional GC
references held by a dtor Object really should be valid whenever that
dtor is invoked. The fact that they're not relegates dtors to having
insignificant value ~ which somehow doesn't seem right. Frankly, I
don't clearly understand why they're in D at all ~ too little
consistency.
Because not having them tends to inspire people to invent their own,
like the dispose() convention in Java. Having language support is
preferable, even if the functionality isn't terrific.
Right :)

That's why what Mike suggests makes sense to me ~ only have dtor support
for those classes that can actually take advantage of it, and have that
enforced by the compiler. If, for example, one could also instantiate
RAII classes at the global scope, then that would take care of loose
ends too.

If that also makes the GC execute faster, then so much the better.
Post by Sean Kelly
Post by kris
Post by Jarrett Billingsley
Yeah, I was thinking about that, maybe instead of just looping
through all class instances linearly and deleting everything, just
keep running GC passes until the regular GC pass has no effect, and
brute force the rest. In this way, my method of "not deleting other
objects in dtors" would delete the instance of the LinkedList on the
first pass, and then all the Nodes on the second, since they are now
orphaned.
Yep
If references were still valid for dtors, and dtors were invoked in a
deterministic manner, perhaps all we'd need is something similar to
"scope(exit)", but referring to a global scope instead? Should the
memory manager take care of the latter?
In most cases this would work, but what about orphaned cycles? The GC
would ultimately just have to pick a place to start. Also, I think
disentangling a complex web of references could be somewhat time
intensive, and collection runs are already too slow :-)
I'd assumed it already followed a dependency tree to figure out the
collectable allocations? But even so, it's probably better to not do any
of that at all (and do what Mike suggests instead).
Sean Kelly
2006-04-05 22:50:17 UTC
Post by kris
Yes, it is. The "death tractors" (dtors in D) are notably less than
useful right now. Any dependencies are likely in an unknown state (as
you note), and then, dtors are not invoked when the program exits. From
what I recall, dtors are not even invoked when you "delete" an object?
It's actually quite hard to nail down when they /are/ invoked :)
I think dtors are called whenever an object is destroyed, be it via
delete or by the GC. And the GC should perform a complete clean-up on
app termination. I believe this is the current behavior in both Phobos
and Ares (look at internal/gc/gc.d:gc_term() in Phobos and
dmdrt/memory.d:gc_term() in Ares for the shutdown cleanup code).
Post by kris
Given all that, the finalizer behaviour mentioned above sounds rather
like the current death-tractor behaviour?
It is exactly. The dtor behavior has simply changed to be suitable for
a more effective clean-up whenever the object is destroyed
deterministically (ie. via delete or as an auto object). I suppose an
alternative would be to pass a state flag to the dtor to indicate the
manner of disposal? I really can't think of a means of implementing
this that is as elegant as D deserves.


Sean
Jarrett Billingsley
2006-04-05 22:40:30 UTC
"Sean Kelly" <sean at f4.ca> wrote in message
Since finalizers are called when the GC destroys an object, they are very
limited in what they can do. They can't assume any GC managed object they
have a reference to is valid, etc. By contrast, destructors can make this
assumption, because the object is being destroyed deterministically. I
think having both may be too confusing to be worthwhile, but it would
The argument against finalizers, as Mike mentioned, is that you typically
want to reclaim such special resources deterministically, so letting the
GC take care of this 'someday' is of questionable utility.
Thank you for that clear, concise, and un-condescending reply :)
Bruno Medeiros
2006-04-09 18:18:12 UTC
Post by Sean Kelly
Post by Jarrett Billingsley
"Sean Kelly" <sean at f4.ca> wrote in message
Post by Sean Kelly
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
Would you mind explaining why exactly there needs to be a difference
between destructors and finalizers? I've been following all the
arguments about this heap vs. auto classes and dtors vs. finalizers,
and I still can't figure out why destructors _can't be the
finalizers_. Do finalizers do something fundamentally different from
destructors?
Since finalizers are called when the GC destroys an object, they are
very limited in what they can do. They can't assume any GC managed
object they have a reference to is valid, etc. By contrast, destructors
can make this assumption, because the object is being destroyed
deterministically. I think having both may be too confusing to be
class LinkedList {
    ~this() { // called deterministically
        for( Node n = top; n; ) {
            Node t = n.next;
            delete n;
            n = t;
        }
        finalize();
    }
    void finalize() { // called by GC
        // nodes may have already been destroyed
        // so leave them alone, but special
        // resources could be reclaimed
    }
}
The argument against finalizers, as Mike mentioned, is that you
typically want to reclaim such special resources deterministically, so
letting the GC take care of this 'someday' is of questionable utility.
Sean
Ok, I think we can tackle this problem in a better way. So far, people
have been thinking about the fact that when destructors are called in a
GC cycle, they are called with finalizer semantics (i.e., you don't know
if the member references are valid or not, thus you can't use them).

This is a problem when in a destructor, one would like to destroy
component objects (as the Nodes of the LinkedList example).


Some ideas were discussed here, but I didn't think any were fruitful. Like:
*Forcing all classes with destructors to be auto classes -> doesn't
add any usefulness, instead just nuisances.
*Making the GC destroy objects in an order that keeps member
references valid -> has a high performance cost and/or is probably just
not possible (circular references?).


Perhaps another way would be to have the following behavior:
- When a destructor is called during a GC (i.e., "as a finalizer") for
an object, then the member references are not valid and cannot be
referenced, *but they can be deleted*. A member will be deleted iff it
has not been deleted already.
I think this can be done without significant overhead. At the end of a
GC cycle, the GC already has a list of all objects that are to be
deleted. Thus, for the release phase, each of those objects could carry
a flag indicating whether it was already deleted or not. Thus when
LinkedList deletes a Node, the delete is only performed if the Node has
not already been deleted.


Still, while the previous idea might be good, it's not optimal,
because we are not clearly perceiving the problem/issue at hand. What
we *really* want is to directly couple the lifecycle of a component
(member) object with its composite (owner) object. A Node of a
LinkedList has the same lifecycle as its LinkedList, so a Node shouldn't
even be an independently managed garbage-collection element.

What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its behavior
is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
with the language and be available for all types. With usage like:

class LinkedList {
    ...
    void Add(Object obj) {
        Node node = mnew Node(blabla);
        ...
    }
}

Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid. One
has to be careful now, as mnew'ed objects are effectively under manual
memory management, and so every mnew must have a corresponding delete,
lest there be dangling pointers or memory leaks. Nonetheless it seems to
be the only sane solution to this problem.
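
The mnew idea can be approximated in library code. What follows is only a sketch, assuming present-day D with the core.memory, core.stdc.stdlib and std.conv modules (the delete expression used elsewhere in this thread has since been deprecated in favor of destroy). The names mnew, mdelete and liveNodes are hypothetical, invented for illustration; they are not part of the language or of Phobos. Memory comes from C's malloc, so the GC never reclaims it, while GC.addRange registers the block so that references stored inside it are still scanned.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

__gshared int liveNodes;   // hypothetical counter, only here to make the effect observable

class Node
{
    Object payload;        // GC-managed reference, kept visible to the GC via addRange
    Node next;

    this(Object payload, Node next)
    {
        this.payload = payload;
        this.next = next;
        ++liveNodes;
    }

    ~this() { --liveNodes; }
}

// Hypothetical stand-in for the proposed `mnew`: scanned by the GC, never collected by it.
T mnew(T, Args...)(Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* p = malloc(size);
    assert(p !is null, "out of memory");
    GC.addRange(p, size);
    return emplace!T(p[0 .. size], args);
}

// Every mnew needs a matching mdelete; nothing else frees the block.
void mdelete(T)(T obj) if (is(T == class))
{
    if (obj is null) return;
    void* p = cast(void*) obj;
    destroy(obj);          // run the destructor
    GC.removeRange(p);
    free(p);
}

class LinkedList
{
    Node top;

    void add(Object obj) { top = mnew!Node(obj, top); }

    ~this()
    {
        // The nodes are not owned by the GC, so they are still valid here.
        for (Node n = top; n !is null; )
        {
            Node t = n.next;
            mdelete(n);
            n = t;
        }
        top = null;
    }
}

unittest
{
    auto list = new LinkedList;
    list.add(new Object);
    list.add(new Object);
    assert(liveNodes == 2);
    destroy(list);         // deterministic destruction of the owner
    assert(liveNodes == 0);
}
```

The unittest block only exercises the deterministic path. Whether a given runtime tolerates GC.removeRange being called from a destructor that the collector itself invokes is an assumption of this sketch, not something it demonstrates.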


Another interesting addition is to extend the concept of auto to class
members. Just as auto currently couples the lifecycle of a variable to
the enclosing function, an auto class member would couple the lifecycle
of the member to its owner object. It would get deleted implicitly when
the owner object got deleted. Here is another (made up) example:

class SomeUIWidget {
    auto Color fgcolor;
    auto Color bgcolor;
    auto Size size;
    auto Image image;
    ...
}

The auto members would then have to be initialized on a constructor or
something (the exact restrictions might vary, such as being final or not).
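
A similar coupling can be had without new member syntax by making the owned resource a value type. This is a sketch only, assuming present-day D, where struct fields of a class are destroyed together with their owner. The Image struct, its handle field and the imagesReleased counter are made up for illustration.

```d
__gshared int imagesReleased;   // hypothetical counter standing in for real cleanup

struct Image
{
    int handle;                 // hypothetical non-GC resource, 0 means "none"

    ~this()
    {
        if (handle != 0)
        {
            ++imagesReleased;   // a real type would release the handle here
            handle = 0;
        }
    }
}

class SomeUIWidget
{
    Image image;                // stored in place, so its lifetime is tied to the widget

    this(int handle) { image.handle = handle; }
}

unittest
{
    auto w = new SomeUIWidget(7);
    assert(imagesReleased == 0);
    destroy(w);                 // destroys the widget and, with it, the Image field
    assert(imagesReleased == 1);
}
```

Because the Image lives inside the widget's own memory, its destructor never has to follow a reference to another GC-managed object, which sidesteps the validity question for that member.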
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
kris
2006-04-09 20:06:07 UTC
Post by Bruno Medeiros
Post by Sean Kelly
Post by Jarrett Billingsley
"Sean Kelly" <sean at f4.ca> wrote in message
Post by Sean Kelly
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
Would you mind explaining why exactly there needs to be a difference
between destructors and finalizers? I've been following all the
arguments about this heap vs. auto classes and dtors vs. finalizers,
and I still can't figure out why destructors _can't be the
finalizers_. Do finalizers do something fundamentally different from
destructors?
Since finalizers are called when the GC destroys an object, they are
very limited in what they can do. They can't assume any GC managed
object they have a reference to is valid, etc. By contrast,
destructors can make this assumption, because the object is being
destroyed deterministically. I think having both may be too confusing
class LinkedList {
    ~this() { // called deterministically
        for( Node n = top; n; ) {
            Node t = n.next;
            delete n;
            n = t;
        }
        finalize();
    }
    void finalize() { // called by GC
        // nodes may have already been destroyed
        // so leave them alone, but special
        // resources could be reclaimed
    }
}
The argument against finalizers, as Mike mentioned, is that you
typically want to reclaim such special resources deterministically, so
letting the GC take care of this 'someday' is of questionable utility.
Sean
Ok, I think we can tackle this problem in a better way. So far, people
have been thinking about the fact that when destructors are called in a
GC cycle, they are called with finalizer semantics (i.e., you don't know
if the member references are valid or not, thus you can't use them).
This is a problem when in a destructor, one would like to destroy
component objects (as the Nodes of the LinkedList example).
*Forcing all classes with destructors to be auto classes -> doesn't add
any usefulness, instead just nuisances.
*Making the GC destroy objects in an order that makes members
references valid -> has a high performance cost and/or is probably just
not possible (circular references?).
- When a destructor is called during a GC (i.e., "as a finalizer") for
an object, then the member references are not valid and cannot be
referenced, *but they can be deleted*. It will be deleted iff it has not
been deleted already.
I think this can be done without significant overhead. At the end of a
GC cycle, the GC has already a list of all objects that are to be
deleted. Thus, on the release phase, it could be modified to have a flag
indicating whether the object was already deleted or not. Thus when
LinkedList deletes a Node, the delete is only performed if the Node has
not already been deleted.
Still, while the previous idea might be good, it's not the optimal,
because we are not clearly apperceiving the problem/issue at hand. What
we *really* want is to directly couple the lifecycle of a component
(member) object with it's composite (owner) object. A Node of a
LinkedList has the same lifecycle of it's LinkedList, so Node shouldn't
even be a independent Garbage Collection managing element.
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). It's behavior
is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
...
Add(Object obj) {
Node node = mnew Node(blabla);
...
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid. One
has to be careful now, as mnew'ed objects are effectively under manual
memory management, and so every mnew must have a corresponding delete,
lest there be dangling pointers or memory leaks. Nonetheless it seems to
be the only sane solution to this problem.
Another interesting addition, is to extend the concept of auto to class
members. Just as currently auto couples the lifecycle of a variable to
the enclosing function, an auto class member would couple the lifecycle
of its member to it's owner object. It would get deleted implicitly when
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor or
something (the exact restrictions might vary, such as being final or not).
Regardless of how it's implemented, what's needed is a bit of
consistency. Currently, dtors are invoked with two entirely different
world-states: with valid state, and with unspecified state. What makes
this generally unworkable is the fact that (a) the difference in state
is often critical to the operation of the dtor, and (b) there's no clean
way to tell the difference.

I use a bit of a hack to distinguish between the two: a common module
has a global variable set to true when the enclosing module-dtor is
invoked. This obviously depends upon module-dtors being first (which
they currently are, but that is not in the spec). Most of you will
probably be going "eww" at this point, but it's the only way I found to
make dtors consistent and thus usable. Further, this is only workable if
the dtor() itself can be abandoned when in state (b) above; prohibiting
the use of dtors for a whole class of cleanup concerns, and forcing one
to defer to the dispose() or close() pattern ~~ some say anti-pattern.

As I understand it, the two states correspond to (1) an explicit
'delete' of the object, which includes "auto" usage; and (2) implicit
cleanup via the GC.

The suggestion to restrict dtor to 'auto' classes is a means to limit
dtors to #1 above; thus at least making them consistent. However, there
are common cases that #1 does not allow for (I'm thinking specifically
of object lifetimes that are not related to scope ~ such as time-based).
That would need to be addressed somehow?

Turning to your suggestions ~ the 'marking' of references such that they
can be "deleted" multiple times is perhaps questionable, partly because
it appears to be specific to the GC implementation? I imagine an
incremental collector would have problems with this approach, even if it
were workable with a "stop the world" collector? I don't know for sure,
but suspect there'd be issues there somewhere.

Whatever the resolution, consistency should be the order of the day.

- Kris
kris
2006-04-09 20:51:27 UTC
Post by kris
I use a bit of a hack to distinguish between the two: a common module
has a global variable set to true when the enclosing module-dtor is
invoked. This obviously depends upon module-dtors being first (which
they currently are, but that is not in the spec). Most of you will
probably be going "eww" at this point, but it's the only way I found to
make dtors consistent and thus usable. Further, this is only workable if
the dtor() itself can be abandoned when in state (b) above; prohibiting
the use of dtors for a whole class of cleanup concerns, and forcing one
to defer to the dispose() or close() pattern ~~ some say anti-pattern.
After reading, that paragraph does not reflect the status-quo at all ...

First, it should have said "used" instead of "use" (past-tense ~ this is
not applied any more, since dtors have all but been abandoned). Second,
the identification of "state" was limited to program termination only ~
the classes in question were actually collected only at that point.
Third, the cleanup did not rely on GC managed memory. All in all, that
paragraph is pretty darned misleading ~~ my bad :-(

The take-home message is that I did not find a general mechanism to
distinguish between valid-state and unspecified-state for a dtor ~ the
oft-crucial inconsistency remains in its fully-fledged guise. The other
issue is that I clearly should avoid posting whilst hallucinating.

Sorry;
Bruno Medeiros
2006-04-10 12:16:31 UTC
Post by kris
Regardless of how it's implemented, what's needed is a bit of
consistency. Currently, dtors are invoked with two entirely different
world-states: with valid state, and with unspecified state. What makes
this generally unworkable is the fact that (a) the difference in state
is often critical to the operation of the dtor, and (b) there's no clean
way to tell the difference.
Hum, from what you said follows a rather trivial alternative solution
to the problem: give the destructor an implicit parameter/variable
that indicates whether it was called explicitly or as a finalizer (i.e.,
in a GC run). (This would be similar in semantics to Sean's suggestion
of separating the destruction and finalize methods.)

class LinkedList {
    ~this() { // called manually/explicitly and automatically
        if(explicit) {
            for( Node n = top; n; ) {
                Node t = n.next;
                delete n;
                n = t;
            }
        }
        // ... finalize here
    }
    ...
}

Would this be acceptable? How would this compare to other suggestions? I
can think of a few things to say versus my suggestion.
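
Lacking such an implicit flag, the same split can be emulated by hand. A sketch, assuming present-day D: dispose(), the disposed flag and the openHandles counter are hypothetical names chosen for illustration, and destroy stands in for the delete expression used above. The explicit path walks the nodes; the destructor, which may run as a finalizer, touches only the non-GC resource.

```d
__gshared int openHandles;      // hypothetical stand-in for a non-GC resource

class Node
{
    int value;
    Node next;

    this(int value, Node next) { this.value = value; this.next = next; }
}

class LinkedList
{
    Node top;
    private bool disposed;

    this() { ++openHandles; }

    void push(int value) { top = new Node(value, top); }

    // Explicit path: the list is still reachable from the caller,
    // so every node is valid and can be destroyed here.
    void dispose()
    {
        if (disposed) return;
        disposed = true;
        for (Node n = top; n !is null; )
        {
            Node t = n.next;
            destroy(n);
            n = t;
        }
        top = null;
        --openHandles;
    }

    // Finalizer path: nodes may already be gone, so only the
    // non-GC resource is released, and only if dispose() never ran.
    ~this()
    {
        if (!disposed) --openHandles;
    }
}

unittest
{
    auto list = new LinkedList;
    list.push(1);
    list.push(2);
    assert(openHandles == 1);
    list.dispose();
    assert(openHandles == 0);
    assert(list.top is null);
}
```

This is essentially the dispose() convention mentioned earlier in the thread, so it inherits the same drawback: nothing forces the caller to invoke it.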
Post by kris
As I understand it, the two states correspond to (1) an explicit
'delete' of the object, which includes "auto" usage; and (2) implicit
cleanup via the GC.
The suggestion to restrict dtor to 'auto' classes is a means to limit
dtors to #1 above; thus at least making them consistent. However, there
are common cases that #1 does not allow for (I'm thinking specifically
of object lifetimes that are not related to scope ~ such as time-based).
That would need to be addressed somehow?
Turning to your suggestions ~ the 'marking' of references such that they
can be "deleted" multiple times is perhaps questionable, partly because
it appears to be specific to the GC implementation? I imagine an
incremental collector would have problems with this approach, even if it
were workable with a "stop the world" collector? I don't know for sure,
but suspect there'd be issues there somewhere.
It works for a stop-the-world collector, I'm sure. As for an incremental
collector, hum... well, it works if the collector guarantees the following:
* The collector determines a set S of objects to be reclaimed, and no
object in S is referenced from outside of S.
Post by kris
Whatever the resolution, consistency should be the order of the day.
- Kris
Manual and automatic memory management are two very different paradigms
that are likely impossible or impractical to be made "consistent" or
conciliated, at least in the way you are implying. The "only auto
classes have destructors" suggestion only makes it "consistent" because
it limits the usage of the class to only one paradigm (manual management).
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
kris
2006-04-10 19:01:49 UTC
Permalink
Post by Bruno Medeiros
Post by kris
Regardless of how it's implemented, what's needed is a bit of
consistency. Currently, dtors are invoked with two entirely different
world-states: with valid state, and with unspecified state. What makes
this generally unworkable is the fact that (a) the difference in state
is often critical to the operation of the dtor, and (b) there's no
clean way to tell the difference.
Hum, from what you said, follows a rather trivial alternative solution
to the problem: Have the destructor have an implicit parameter/variable,
that indicates whether it was called explicitly or as a finalizer (i.e,
in a GC run): (This would be similar in semantics to Sean's suggestion
of separating the destruction and finalize methods)
class LinkedList {
~this() { // called manually/explicitly and automatically
if(explicit) {
for( Node n = top; n; ) {
Node t = n.next;
delete n;
n = t;
}
}
// ... finalize here
}
...
Would this be acceptable? How would this compare to other suggestions? I
can think of a few things to say versus my suggestion.
Perhaps it would be better as an optional parameter? This certainly
would allow for lazy dtors that don't need timely cleanup. Although I
can't think of any reasonable examples to illustrate with.

However, it clearly exposes the "uneasy" status that a dtor might find
itself in. For that reason it seems a bit like a hack on top of a queasy
problem (to me). In cases like these I tend to think it's better to
start off constrained and deterministic (remove those 'lazy'
non-deterministic dtor invocations), and then optionally open things up
as is deemed necessary, or when a resolution to the non-determinism is
found.
Sean Kelly
2006-04-09 21:31:25 UTC
Permalink
Post by Bruno Medeiros
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its behavior
is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
...
Add(Object obj) {
Node node = mnew Node(blabla);
...
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid. One
has to be careful now, as mnew'ed objects are effectively under manual
memory management, and so every mnew must have a corresponding delete,
lest there be dangling pointers or memory leaks. Nonetheless it seems to
be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be
done now without the addition of new keywords by adding two new GC
functions: release and reclaim (bad names, but they're all I could think
of). 'release' would tell the GC not to automatically finalize or
delete the memory block, as you've suggested above, and 'reclaim' would
transfer ownership back to the GC. It's more error prone than I'd like,
but also perhaps the most reasonable.
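As a rough illustration of what such a pair could look like, here is a sketch that maps the two names onto `GC.addRoot` and `GC.removeRoot` from `core.memory`. This mapping is an assumption made for the example, not something proposed in the thread: `release` and `reclaim` are hypothetical wrappers, `Node` is a toy class, and rooting a block also keeps everything reachable from it alive, which is broader than the behavior described above.

```d
import core.memory : GC;

class Node
{
    int value;
    this(int value) { this.value = value; }
}

/// Hypothetical `release`: pin the block so a collection will not reclaim it.
T release(T)(T obj) if (is(T == class))
{
    GC.addRoot(cast(void*) obj);
    return obj;
}

/// Hypothetical `reclaim`: unpin the block and hand it back to the collector.
void reclaim(T)(T obj) if (is(T == class))
{
    GC.removeRoot(cast(void*) obj);
}

unittest
{
    auto n = release(new Node(5));
    GC.collect();            // the pinned node is not collected
    assert(n.value == 5);
    reclaim(n);              // from here on, ordinary GC rules apply again
}
```

Every `release` needs a matching `reclaim` (or an explicit teardown), otherwise the block is never collected, which is the error-prone part mentioned above.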

A possible alternative would be for the GC to perform its cleanup in two
stages. The first sweep runs all finalizers on orphaned objects, and
the second releases the memory. Thus in Eric's example on d.D.learn, he
would be able to legally iterate across his AA and close all HANDLEs
because the memory would still be valid at that stage.

Assuming there aren't any problems with this latter idea, I think it
should be implemented as standard behavior for the GC, and the former
idea should be provided as an option. Thus the user would have complete
manual control available when needed, but more foolproof basic behavior
for simpler situations.
Post by Bruno Medeiros
Another interesting addition, is to extend the concept of auto to class
members. Just as currently auto couples the lifecycle of a variable to
the enclosing function, an auto class member would couple the lifecycle
of its member to its owner object. It would get deleted implicitly when
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor or
something (the exact restrictions might vary, such as being final or not).
I like this idea as well, though it may require some additional
bookkeeping to accomplish. For example, a GC scan may encounter the
members before the owner, so each member may need to contain a hidden
pointer to the owner object so the GC knows how to sort things out.


Sean
Georg Wrede
2006-04-09 23:23:58 UTC
Permalink
Post by Sean Kelly
Post by Bruno Medeiros
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its
behavior is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
...
Add(Object obj) {
Node node = mnew Node(blabla);
...
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid.
One has to be careful now, as mnew'ed objects are effectively under
manual memory management, and so every mnew must have a corresponding
delete, lest there be dangling pointers or memory leaks. Nonetheless it
seems to be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be
done now without the addition of new keywords by adding two new GC
functions: release and reclaim (bad names, but they're all I could think
of). 'release' would tell the GC not to automatically finalize or
delete the memory block, as you've suggested above, and 'reclaim' would
transfer ownership back to the GC. It's more error prone than I'd like,
but also perhaps the most reasonable.
A possible alternative would be for the GC to perform its cleanup in two
stages. The first sweep runs all finalizers on orphaned objects, and
the second releases the memory. Thus in Eric's example on d.D.learn, he
would be able to legally iterate across his AA and close all HANDLEs
because the memory would still be valid at that stage.
Assuming there aren't any problems with this latter idea, I think it
should be implemented as standard behavior for the GC, and the former
idea should be provided as an option. Thus the user would have complete
manual control available when needed, but more foolproof basic behavior
for simpler situations.
Post by Bruno Medeiros
Another interesting addition, is to extend the concept of auto to
class members. Just as currently auto couples the lifecycle of a
variable to the enclosing function, an auto class member would couple
the lifecycle of its member to its owner object. It would get deleted
implicitly when the owner object got deleted. Here is another (made
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor or
something (the exact restrictions might vary, such as being final or not).
I like this idea as well, though it may require some additional
bookkeeping to accomplish. For example, a GC scan may encounter the
members before the owner, so each member may need to contain a hidden
pointer to the owner object so the GC knows how to sort things out.
If the above case was written as:

class SomeUIWidget {
Color fgcolor;
Color bgcolor;
Size size;
Image image;
...

and the class didn't have an explicit destructor, then the only "damage"
at GC (or otherwise destruction) time would be that a couple of Color
instances, a Size instance and an Image instance would be "left over"
after that particular GC run.

Big deal? At the next GC run (unless they'd be pointed-to by other
things), they'd get deleted too. No major flood of tears here.

Somehow I fear folks are making this a way too complicated thing.
Bruno Medeiros
2006-04-10 11:33:33 UTC
Permalink
Post by Bruno Medeiros
class SomeUIWidget {
Color fgcolor;
Color bgcolor;
Size size;
Image image;
...
and the class didn't have an explicit destructor, then the only "damage"
at GC (or otherwise destruction) time would be that a couple of Color
instances, a Size instance and an Image instance would be "left over"
after that particular GC run.
Big deal? At the next GC run (unless they'd be pointed-to by other
things), they'd get deleted too. No major flood of tears here.
Somehow I fear folks are making this a way too complicated thing.
Actually, with any decent GC, all of those objects will be reclaimed on
the first GC run (and DMD does that). So you are correct that there is
no difference when running the GC on that object.
But you miss the point. The point (of my suggestions) was to be able to
have a destruction system that would work correctly both when called by
a GC cycle, and when called explicitly (outside of a GC cycle).
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Bruno Medeiros
2006-04-10 11:41:28 UTC
Permalink
Post by Bruno Medeiros
Post by Sean Kelly
Post by Bruno Medeiros
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its
behavior is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
...
Add(Object obj) {
Node node = mnew Node(blabla);
...
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid.
One has to be careful now, as mnew'ed objects are effectively under
manual memory management, and so every mnew must have a corresponding
delete, lest there be dangling pointers or memory leaks. Nonetheless
it seems to be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be
done now without the addition of new keywords by adding two new GC
functions: release and reclaim (bad names, but they're all I could
think of). 'release' would tell the GC not to automatically finalize
or delete the memory block, as you've suggested above, and 'reclaim'
would transfer ownership back to the GC. It's more error prone than
I'd like, but also perhaps the most reasonable.
A possible alternative would be for the GC to perform its cleanup in
two stages. The first sweep runs all finalizers on orphaned objects,
and the second releases the memory. Thus in Eric's example on
d.D.learn, he would be able to legally iterate across his AA and close
all HANDLEs because the memory would still be valid at that stage.
Assuming there aren't any problems with this latter idea, I think it
should be implemented as standard behavior for the GC, and the former
idea should be provided as an option. Thus the user would have
complete manual control available when needed, but more foolproof
basic behavior for simpler situations.
Post by Bruno Medeiros
Another interesting addition, is to extend the concept of auto to
class members. Just as currently auto couples the lifecycle of a
variable to the enclosing function, an auto class member would couple
the lifecycle of its member to its owner object. It would get
deleted implicitly when the owner object got deleted. Here is
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor
or something (the exact restrictions might vary, such as being final
or not).
I like this idea as well, though it may require some additional
bookkeeping to accomplish. For example, a GC scan may encounter the
members before the owner, so each member may need to contain a hidden
pointer to the owner object so the GC knows how to sort things out.
class SomeUIWidget {
Color fgcolor;
Color bgcolor;
Size size;
Image image;
...
and the class didn't have an explicit destructor, then the only "damage"
at GC (or otherwise destruction) time would be that a couple of Color
instances, a Size instance and an Image instance would be "left over"
after that particular GC run.
Big deal? At the next GC run (unless they'd be pointed-to by other
things), they'd get deleted too. No major flood of tears here.
Somehow I fear folks are making this a way too complicated thing.
Actually, with any decent GC, all of those objects will be reclaimed on
the first GC run (and DMD does that). So you are correct that there is
no difference when running the GC on that object.
But you miss the point. The point (of my suggestions) was to be able to
have a destruction system that would work "correctly/extensively" both
when called by a GC cycle, and when called explicitly (outside of a GC
cycle).
By "correctly/extensively" I mean that the destructor would be able in
both cases to ensure the destruction of its owned resources.
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Bruno Medeiros
2006-04-10 10:58:30 UTC
Permalink
Post by Sean Kelly
Post by Bruno Medeiros
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its
behavior is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
...
Add(Object obj) {
Node node = mnew Node(blabla);
...
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid.
One has to be careful now, as mnew'ed objects are effectively under
manual memory management, and so every mnew must have a corresponding
delete, lest there be dangling pointers or memory leaks. Nonetheless it
seems to be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be
done now without the addition of new keywords by adding two new GC
functions: release and reclaim (bad names, but they're all I could think
of). 'release' would tell the GC not to automatically finalize or
delete the memory block, as you've suggested above, and 'reclaim' would
transfer ownership back to the GC. It's more error prone than I'd like,
but also perhaps the most reasonable.
Hum, indeed.
Post by Sean Kelly
A possible alternative would be for the GC to perform its cleanup in two
stages. The first sweep runs all finalizers on orphaned objects, and
the second releases the memory. Thus in Eric's example on d.D.learn, he
would be able to legally iterate across his AA and close all HANDLEs
because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed by
the GC on that cycle? Or just the subset of those objects, that are not
referenced by anyone?
Post by Sean Kelly
Assuming there aren't any problems with this latter idea, I think it
should be implemented as standard behavior for the GC, and the former
idea should be provided as an option. Thus the user would have complete
manual control available when needed, but more foolproof basic behavior
for simpler situations.
Post by Bruno Medeiros
Another interesting addition, is to extend the concept of auto to
class members. Just as currently auto couples the lifecycle of a
variable to the enclosing function, an auto class member would couple
the lifecycle of its member to its owner object. It would get deleted
implicitly when the owner object got deleted. Here is another (made
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor or
something (the exact restrictions might vary, such as being final or not).
I like this idea as well, though it may require some additional
bookkeeping to accomplish. For example, a GC scan may encounter the
members before the owner, so each member may need to contain a hidden
pointer to the owner object so the GC knows how to sort things out.
Sean
Hum, true, it would need some additional bookkeeping, didn't realize
that immediately. The semantics like those that I mentioned in my
previous post would suffice:

"When a destructor is called upon an object during a GC (i.e., "as a
finalizer"), then the member references are not valid and cannot be
referenced, *but they can be deleted*. Each will be deleted iff it has
not been deleted already in the reclaiming phase."

I don't think your algorithm (having a hidden pointer) would be
necessary (or even feasible), and the one I mentioned before would suffice.
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Sean Kelly
2006-04-10 16:24:15 UTC
Permalink
Post by Bruno Medeiros
Post by Sean Kelly
A possible alternative would be for the GC to perform its cleanup in
two stages. The first sweep runs all finalizers on orphaned objects,
and the second releases the memory. Thus in Eric's example on
d.D.learn, he would be able to legally iterate across his AA and close
all HANDLEs because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed by
the GC on that cycle? Or just the subset of those objects, that are not
referenced by anyone?
All objects that are to be reclaimed. I figured your other suggestion
could be used for more complex cases.
Post by Bruno Medeiros
Post by Sean Kelly
Post by Bruno Medeiros
Another interesting addition, is to extend the concept of auto to
class members. Just as currently auto couples the lifecycle of a
variable to the enclosing function, an auto class member would couple
the lifecycle of its member to its owner object. It would get
deleted implicitly when the owner object got deleted. Here is
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor
or something (the exact restrictions might vary, such as being final
or not).
I like this idea as well, though it may require some additional
bookkeeping to accomplish. For example, a GC scan may encounter the
members before the owner, so each member may need to contain a hidden
pointer to the owner object so the GC knows how to sort things out.
Hum, true, it would need some additional bookkeeping, didn't realize
that immediately. The semantics like those that I mentioned in my
"When a destructor is called upon an object during a GC (i.e., "as a
finalizer"), then the member references are not valid and cannot be
referenced, *but they can be deleted*. Each will be deleted iff it has
not been deleted already in the reclaiming phase."
I don't think your algorithm (having a hidden pointer) would be
necessary (or even feasible), and the one I mentioned before would suffice.
Hrm... but what if the owner is simply collected via a normal GC run?
In that case, the GC may encounter the member objects before the owner
object. I suppose bookkeeping at the member level may not be necessary,
but it may result in an extra scan through the list of objects to be
finalized to determine who owns what.


Sean
Bruno Medeiros
2006-04-13 18:53:30 UTC
Permalink
Post by Sean Kelly
Post by Bruno Medeiros
Post by Sean Kelly
A possible alternative would be for the GC to perform its cleanup in
two stages. The first sweep runs all finalizers on orphaned objects,
and the second releases the memory. Thus in Eric's example on
d.D.learn, he would be able to legally iterate across his AA and close
all HANDLEs because the memory would still be valid at that stage.
By orphaned objects, do you mean all objects that are to be reclaimed
by the GC on that cycle? Or just the subset of those objects, that are
not referenced by anyone?
All objects that are to be reclaimed. I figured your other suggestion
could be used for more complex cases.
That way, you have the guarantee that all references are valid, but some
instances would have their destructors called multiple times. That's
likely a behavior that isn't acceptable in some cases.
Post by Sean Kelly
Post by Bruno Medeiros
Post by Sean Kelly
Post by Bruno Medeiros
Another interesting addition, is to extend the concept of auto to
class members. Just as currently auto couples the lifecycle of a
variable to the enclosing function, an auto class member would
couple the lifecycle of its member to its owner object. It would
get deleted implicitly when the owner object got deleted. Here is
class SomeUIWidget {
auto Color fgcolor;
auto Color bgcolor;
auto Size size;
auto Image image;
...
The auto members would then have to be initialized on a constructor
or something (the exact restrictions might vary, such as being final
or not).
I like this idea as well, though it may require some additional
bookkeeping to accomplish. For example, a GC scan may encounter the
members before the owner, so each member may need to contain a hidden
pointer to the owner object so the GC knows how to sort things out.
Hum, true, it would need some additional bookkeeping, didn't realize
that immediately. The semantics like those that I mentioned in my
"When a destructor is called upon an object during a GC (i.e., "as a
finalizer"), then the member references are not valid and cannot be
referenced, *but they can be deleted*. Each will be deleted iff it has
not been deleted already in the reclaiming phase."
I don't think your algorithm (having a hidden pointer) would be
necessary (or even feasible), and the one I mentioned before would suffice.
Hrm... but what if the owner is simply collected via a normal GC run? In
that case, the GC may encounter the member objects before the owner
object. I suppose bookkeeping at the member level may not be necessary,
but it may result in an extra scan through the list of objects to be
finalized to determine who owns what.
Sean
The bookkeeping is made by the GC and memory pool manager. A scan
through the list of objects to be finalized is necessary, but it won't
be an _extra_ scan. Let me try to explain this way:

*** The current GC algorithm: ***

delete obj:

m = getMemManagerHandle(obj);
if(m.isObjectInstance)
m.obj.destroy(); // calls ~this()
freeMemory(m);

GC:

GC determines a set S of instances to be reclaimed (garbage);
foreach(m in S) {
delete m;
}

*** The extended GC algorithm: ***

delete obj:

m = getMemManagerHandle(obj);
if(m.isDeleted)
return;
m.isDeleted = true; // mark before destroying, so a repeated delete is a no-op
if(m.isObjectInstance)
m.obj.destroy(); // calls ~this()
if(!m.isGarbage) // If it is not in S
freeMemory(m);

GC:

GC determines a set S of instances to be reclaimed (garbage);
foreach(m in S) {
m.isGarbage = true;
}
foreach(m in S) {
delete m;
}
foreach(m in S) {
freeMemory(m);
}


And there we go. No increase in algorithmic complexity. There is only an
increase in the Memory Manager record size (we need a flag for
m.isDeleted, and we need it only during a GC run).
The reason we don't freeMemory(m) right after delete m; is because we
need the bookkeeping of m.isDeleted until the end of the GC run.
The reason we have m.isGarbage is to allow the deletion of objects not
in S during the GC run. (it is an optimization of doing "S.contains(m)" )

Hope I don't have a bug up there :P
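To make the three passes easier to follow, here is a toy simulation of the extended algorithm, assuming `isDeleted` is set before the destructor runs. Nothing here is druntime API: `Record`, `owned`, `freed`, `dtorCalls`, `deleteObj` and `collect` are invented for the sketch, and "destructor" just means deleting the owned members.

```d
/// Toy stand-in for a memory-manager record.
class Record
{
    string name;
    Record[] owned;      // members this object deletes from its destructor
    bool isDeleted;      // destructor already ran
    bool isGarbage;      // member of the set S during a collection
    bool freed;          // memory handed back to the allocator
    int dtorCalls;

    this(string name) { this.name = name; }
}

void freeMemory(Record m) { m.freed = true; }

/// Extended `delete`: runs the destructor at most once and leaves the
/// memory of garbage objects for the collector's final pass.
void deleteObj(Record m)
{
    if (m.isDeleted) return;
    m.isDeleted = true;
    ++m.dtorCalls;
    foreach (child; m.owned)     // models ~this() deleting its members
        deleteObj(child);
    if (!m.isGarbage)
        freeMemory(m);
}

/// One collection over a garbage set S that was determined elsewhere.
void collect(Record[] s)
{
    foreach (m; s) m.isGarbage = true;
    foreach (m; s) deleteObj(m);
    foreach (m; s) freeMemory(m);
}

unittest
{
    auto member = new Record("member");
    auto owner = new Record("owner");
    owner.owned = [member];

    // The member comes first in S, the ordering Sean was worried about.
    collect([member, owner]);

    assert(member.dtorCalls == 1 && owner.dtorCalls == 1);
    assert(member.freed && owner.freed);
}
```

Each record is visited a constant number of times, so the work stays linear in the size of S.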
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
pragma
2006-04-13 19:31:55 UTC
Permalink
In article <e1m6mo$19d$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
*** The extended GC algorithm: ***
m = getMemManagerHandle(obj);
if(m.isDeleted)
return;
if(m.isObjectInstance)
m.obj.destroy(); // calls ~this()
if(!m.isGarbage) // If it is not in S
freeMemory(m);
GC determines a set S of instances to be reclaimed (garbage);
foreach(m in S) {
m.isGarbage = true;
}
foreach(m in S) {
delete m;
}
foreach(m in S) {
freeMemory(m);
}
Something like this will help *part* of the problem. By delaying the freeing of
referenced memory, dynamically allocated primitives (like arrays) will continue
to function inside of class destructors. However, this does not help with
references to objects and structs, as they may still be placed in an invalid
state by their own destructors.

/**/ class A{
/**/ public uint resource;
/**/ public this(){ resource = 42; }
/**/ public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/ public A a;
/**/ public this(){ a = new A(); }
/**/ public ~this(){ writefln("resource: %d",a.resource); }
/**/ }

Depending on the ordering in S, the program will output either "resource: 42" or
"resource: 0". The problem only gets worse for object cycles. I'm not saying
it won't work, but it just moves the wrinkle into a different area to be stomped
out.

Now, one way to improve this is if there were a standard method on objects that
can be checked in situations like these. That way you'd know if another object
is finalized, or in the process of being finalized.
Post by Bruno Medeiros
foreach(m in S) {
m.isFinalized = true;
delete m;
}
Now this doesn't make life any easier, but it does make things deterministic.

/**/ class A{
/**/ public uint resource;
/**/ public this(){ resource = 42; }
/**/ public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/ public A a;
/**/ public this(){ a = new A(); }
/**/ public ~this(){ if(!a.isFinalized) writefln("resource: %d",a.resource); }
/**/ }

(another option would be something like gc.isFinalized(a), should the footprint
of Object be an issue)

Now B outputs nothing if A is finalized. That seems like a win, but what if B
really needed that value before A went away? In such a case, you're back to
square-one: you can't depend on the state of another referenced object within a
dtor, valid reference or otherwise.
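One way around that, sketched here under the assumption that B gets an explicit teardown method (the `close`, `closed` and `seen` names are made up for this example), is to read A's state outside of any finalizer, while the order is still under the programmer's control:

```d
class A
{
    uint resource;
    this() { resource = 42; }
    ~this() { resource = 0; }
}

class B
{
    A a;
    uint seen;               // what B observed while tearing down
    private bool closed;

    this() { a = new A(); }

    /// Deterministic teardown: B reads A first, then finalizes it.
    void close()
    {
        if (closed) return;
        closed = true;
        seen = a.resource;   // A has not been finalized yet
        destroy(a);
        a = null;
    }
}

unittest
{
    auto b = new B();
    b.close();
    assert(b.seen == 42);
}
```

This only helps when someone actually calls `close`; if B is left to the collector, the ordering problem described above is unchanged.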

- EricAnderton at yahoo
Bruno Medeiros
2006-04-14 15:44:28 UTC
Permalink
Post by pragma
In article <e1m6mo$19d$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
*** The extended GC algorithm: ***
m = getMemManagerHandle(obj);
if(m.isDeleted)
return;
if(m.isObjectInstance)
m.obj.destroy(); // calls ~this()
if(!m.isGarbage) // If it is not in S
freeMemory(m);
GC determines a set S of instances to be reclaimed (garbage);
foreach(m in S) {
m.isGarbage = true;
}
foreach(m in S) {
delete m;
}
foreach(m in S) {
freeMemory(m);
}
Something like this will help *part* of the problem. By delaying the freeing of
referenced memory, dynamically allocated primitives (like arrays) will continue
to function inside of class destructors. However, this does not help with
references to objects and structs, as they may still be placed in an invalid
state by their own destructors.
/**/ class A{
/**/ public uint resource;
/**/ public this(){ resource = 42; }
/**/ public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/ public A a;
/**/ public this(){ a = new A(); }
/**/ public ~this(){ writefln("resource: %d",a.resource); }
/**/ }
Depending on the ordering in S, the program will output either "resource: 42" or
"resource: 0". The problem only gets worse for object cycles. I'm not saying
it won't work, but it just moves the wrinkle into a different area to be stomped
out.
True, I forgot to mention that. The order of destruction is undefined,
so it will only work with objects where that order doesn't matter. (that
should be the case with most)
Post by pragma
Now, one way to improve this is if there were a standard method on objects that
can be checked in situations like these. That way you'd know if another object
is finalized, or in the process of being finalized.
Post by Bruno Medeiros
foreach(m in S) {
m.isFinalized = true;
delete m;
}
Now this doesn't make life any easier, but it does make things deterministic.
/**/ class A{
/**/ public uint resource;
/**/ public this(){ resource = 42; }
/**/ public ~this(){ resource = 0; }
/**/ }
/**/ class B{
/**/ public A a;
/**/ public this(){ a = new A(); }
/**/ public ~this(){ if(!a.isFinalized) writefln("resource: %d",a.resource); }
/**/ }
(another option would be something like gc.isFinalized(a), should the footprint
of Object be an issue)
Now B outputs nothing if A is finalized. That seems like a win, but what if B
really needed that value before A went away? In such a case, you're back to
square-one: you can't depend on the state of another referenced object within a
dtor, valid reference or otherwise.
- EricAnderton at yahoo
Exactly, you can't really solve the order/state problem with this. I
think the only way to do it is to manually memory manage the member
objects (with a construct such as mmnew or otherwise).
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Bruno Medeiros
2006-04-14 15:50:03 UTC
Permalink
Post by Bruno Medeiros
Post by Sean Kelly
Post by Bruno Medeiros
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its
behavior is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
...
Add(Object obj) {
Node node = mnew Node(blabla);
...
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid.
One has to be careful now, as mnew'ed objects are effectively under
manual memory management, and so every mnew must have a corresponding
delete, lest there be dangling pointers or memory leaks. Nonetheless
it seems to be the only sane solution to this problem.
This does seem to be the most reasonable method. In fact, it could be
done now without the addition of new keywords by adding two new GC
functions: release and reclaim (bad names, but they're all I could
think of). 'release' would tell the GC not to automatically finalize
or delete the memory block, as you've suggested above, and 'reclaim'
would transfer ownership back to the GC. It's more error prone than
I'd like, but also perhaps the most reasonable.
Hum, indeed.
Then again, with a proper allocator (mmnew) there is room for more
optimization. I doubt one would want (or that it would be good) to
change the management ownership of an instance during its lifetime.
Rather, it should be set right from the start (when allocated).

Also, I've realized just now, that with templates one can get a pretty
close solution, with something like:
mmnew!(Foobar)
The shortcoming is you won't be able to use non-default constructors in
that call.
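In current D, variadic templates lift that shortcoming. The following is only a sketch of one possible shape, assuming the memory comes from C `malloc` and is registered with `GC.addRange` so that it is scanned but never reclaimed by the collector. `mnew`, `mdelete` and the demo class `Foobar` (with its `destroyed` counter) are hypothetical names; `GC.addRange`, `GC.removeRange`, `std.conv.emplace` and `destroy` are existing library calls.

```d
import core.exception : onOutOfMemoryError;
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

/// Allocate a class instance outside the GC heap, but let the GC scan it.
T mnew(T, Args...)(auto ref Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* p = malloc(size);
    if (p is null) onOutOfMemoryError();
    GC.addRange(p, size);                  // scanned, never collected
    return emplace!T(p[0 .. size], args);  // runs the chosen constructor
}

/// Counterpart of mnew: run the destructor, then give the memory back.
void mdelete(T)(T obj) if (is(T == class))
{
    if (obj is null) return;
    void* p = cast(void*) obj;
    destroy(obj);
    GC.removeRange(p);
    free(p);
}

class Foobar
{
    int value;
    static int destroyed;                  // demo counter only
    this(int value) { this.value = value; }
    ~this() { ++destroyed; }
}

unittest
{
    auto before = Foobar.destroyed;
    auto f = mnew!Foobar(7);               // non-default constructor
    assert(f.value == 7);
    mdelete(f);
    assert(Foobar.destroyed == before + 1);
}
```

As with the mnew idea itself, every `mnew` needs exactly one `mdelete`; a missed call leaks and a double call is undefined behavior.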
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Mike Capp
2006-04-09 21:50:44 UTC
Permalink
In article <e1bj4r$1gt$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
*Forcing all classes with destructors to be auto classes -> doesn't
add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do
they add? Or requiring a cast to assign from one type to another - sheer
nuisance!

cheers
Mike
Bruno Medeiros
2006-04-10 10:05:00 UTC
Permalink
Post by Mike Capp
In article <e1bj4r$1gt$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
*Forcing all classes with destructors to be auto classes -> doesn't
add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do
they add? Or requiring a cast to assign from one type to another - sheer
nuisance!
cheers
Mike
Protection attributes and casts add usefulness (not gonna detail why).
Forcing all classes with destructors to be auto classes, on the other
hand, severely limits the usage of such classes. An auto class cannot
be a global, a static, a field, or an inout or out parameter. It must be bound to
a function, and *cannot be a part of another data structure*. This
latter restriction, as is, is unacceptable, no?
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Mike Capp
2006-04-10 12:46:34 UTC
Permalink
In article <e1dak2$21d9$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
Protection attributes and casts add usefulness (not gonna detail why).
The usefulness of protection attributes lies solely in preventing you from
misusing something. Same with auto and dtors. If a class needs a dtor, leaving
it to the GC qualifies as misuse in my view.
Post by Bruno Medeiros
Forcing all classes with destructors to be auto classes, on the other
hand, severely limits the usage of such classes. An auto class cannot
be a global, a static, a field, or an inout or out parameter. It must be bound to
a function, and *cannot be a part of another data structure*. This
latter restriction, as is, is unacceptable, no?
Agreed; IIRC, auto members of auto classes were part of my original suggestion,
and I think the dtors-for-autos-only restriction would quickly force this
problem out into the open.

It may be that we're agreeing on the destination and only differing on how to
get there.

cheers
Mike
Don Clugston
2006-04-10 15:00:20 UTC
Permalink
Post by Mike Capp
In article <e1dak2$21d9$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
Protection attributes and casts add usefulness (not gonna detail why).
The usefulness of protection attributes lies solely in preventing you from
misusing something. Same with auto and dtors. If a class needs a dtor, leaving
it to the GC qualifies as misuse in my view.
Post by Bruno Medeiros
Forcing all classes with destructors to be auto classes, on the other
hand, severely limits the usage of such classes. An auto class cannot
be a global, a static, a field, or an inout or out parameter. It must be bound to
a function, and *cannot be a part of another data structure*. This
latter restriction, as is, is unacceptable, no?
Agreed; IIRC, auto members of auto classes were part of my original suggestion,
and I think the dtors-for-autos-only restriction would quickly force this
problem out into the open.
I suspect that if finalisers were abolished, those other restrictions
would be MUCH easier to lift. They probably exist mainly because of the
complexity of the interactions with the GC.
Post by Mike Capp
It may be that we're agreeing on the destination and only differing on how to
get there.
cheers
Mike
Regan Heath
2006-04-10 23:54:42 UTC
Permalink
On Mon, 10 Apr 2006 11:05:00 +0100, Bruno Medeiros
Post by Bruno Medeiros
Post by Mike Capp
In article <e1bj4r$1gt$1 at digitaldaemon.com>, Bruno Medeiros says...
Post by Bruno Medeiros
*Forcing all classes with destructors to be auto classes -> doesn't
add any usefulness, instead just nuisances.
Hmm, yes. Like private/protected member access specifiers - what usefulness do
they add? Or requiring a cast to assign from one type to another - sheer
nuisance!
cheers
Mike
Protection attributes and casts add usefulness (not gonna detail why).
Forcing all classes with destructors to be auto classes, on the other
hand, severely limits the usage of such classes. An auto class cannot
be a global, a static, a field, or an inout or out parameter. It must be bound to
a function, and *cannot be a part of another data structure*. This
latter restriction, as is, is unacceptable, no?
The suggestion I made assumed we could remove these restrictions. I'm not
sure whether that's true or not, if it is impossible can someone explain
to me why? I would be curious to know. It seems that if C++ can have
classes at module/file scope that have destructors, why can't D? I have a
feeling it has something to do with how Walter has implemented it.. but
that _could_ change, if the reasons were strong enough, right?

If we assume (for the purposes of exploring the solution) that the
restrictions can be removed doesn't this idea make a lot of sense?

1. any class/module with a dtor must be 'auto'.
2. any class/module containing a reference to an 'auto' class/module must
be 'auto'.
3. The 'auto' keyword used here; "auto Foo f = new Foo();" is not
required, remove it.
4. A 'shared' keyword is used to indicate a shared 'auto' resource.

Rationale: if a class has cleanup to do, it must be done all the time, not
just sometimes and not selectively*. Therefore any class with cleanup to
do is 'auto' and any class containing an 'auto' member also has cleanup to
do, thus must be 'auto'.

(*) the exception to this rule is a member reference to a shared resource,
thus the 'shared' keyword.

The compiler can auto-generate dtors for classes containing 'auto' members
eg.

auto class File {
    HANDLE h;
    ~this() { CloseHandle(h); }
}

auto class Foo {
    File f;
    /* auto generated dtor:
    ~this() { delete f; }
    */
}

If the user supplies a dtor, the compiler can simply append its auto-dtor
to the end of that (I don't think deleting a reference twice is a
problem). In this way 'auto' propagates itself as required.
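
For comparison, a minimal sketch of how this kind of propagation looks with struct members in present-day D, where the compiler does generate a destructor for an aggregate whose fields have destructors, and runs a user-written destructor body before the field destructors. This is only an analogy to the proposal, using structs rather than the 'auto' classes described above; File, Foo, Bar and the events trace are hypothetical names for the demo.

```d
string[] events;   // demo-only trace of destructor calls, in order

struct File
{
    string name;
    ~this() { if (name.length) events ~= "close " ~ name; }
}

struct Foo
{
    File f;        // no destructor written here
}

struct Bar
{
    File f;
    ~this() { events ~= "Bar cleanup"; }
}

string[] demo()
{
    events = null;
    {
        Foo foo;
        foo.f.name = "a.txt";
    }   // the generated destructor of Foo runs File's destructor here
    {
        Bar bar;
        bar.f.name = "b.txt";
    }   // Bar's own body first, then File's destructor
    return events;
}

void main()
{
    assert(demo() == ["close a.txt", "Bar cleanup", "close b.txt"]);
}
```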

(In fact if you think about it, the keyword 'auto' isn't really even
required. It can be removed and the behaviour outlined above can simply be
implemented)

The shared keyword would prevent the automatic dtor from calling delete on
the shared reference. If that reference was the only 'auto' member it
would therefore prevent the class from being 'auto' itself. The user would
have to manage the shared resource manually, or rather, can rely on it
being deleted by the (one and only) non-shared reference to it, eg.

[file.d]

File a = new File("a.txt");

class Foo {
shared File foo;
this(File f) { foo = f; }
}

void main()
{
Foo f = new Foo(a);
}

The class 'Foo' is not auto, it has no dtor, the compiler does not
generate one, its shared reference 'foo' is never deleted. The module
level reference 'a' is auto, an auto generated module level dtor will
delete it.

The classes affected by this idea are few, I'd say less than 20% (even
with 'auto' propagating up the class tree), the rest will have no dtor and
will simply be collected as normal by the GC, no dtor calls required.

As far as I can see there are no restrictions of use for this idea.
Classes will be the same as they are today, only they'll have
deterministic destruction where required. Assuming of course it can
actually be implemented.

Regan
Georg Wrede
2006-04-09 23:12:50 UTC
Permalink
Post by Bruno Medeiros
Post by Sean Kelly
Post by Jarrett Billingsley
"Sean Kelly" <sean at f4.ca> wrote in message
Post by Sean Kelly
- a type can have a destructor and/or a finalizer
- the destructor is called upon a) explicit delete or b) at end
of scope for auto objects
- the finalizer is called if allocated on the gc heap and the
destructor has not been called
Would you mind explaining why exactly there needs to be a difference
between destructors and finalizers? I've been following all the
arguments about this heap vs. auto classes and dtors vs. finalizers,
and I still can't figure out why destructors _can't be the
finalizers_. Do finalizers do something fundamentally different from
destructors?
Since finalizers are called when the GC destroys an object, they are
very limited in what they can do. They can't assume any GC managed
object they have a reference to is valid, etc. By contrast,
destructors can make this assumption, because the object is being
destroyed deterministically. I think having both may be too confusing
class LinkedList {
    ~this() { // called deterministically
        for( Node n = top; n; ) {
            Node t = n.next;
            delete n;
            n = t;
        }
        finalize();
    }
    void finalize() { // called by GC
        // nodes may have already been destroyed
        // so leave them alone, but special
        // resources could be reclaimed
    }
}
The argument against finalizers, as Mike mentioned, is that you
typically want to reclaim such special resources deterministically, so
letting the GC take care of this 'someday' is of questionable utility.
Sean
Ok, I think we can tackle this problem in a better way. So far, people
have been thinking about the fact that when destructors are called in a
GC cycle, they are called with finalizer semantics (i.e., you don't know
if the member references are valid or not, thus you can't use them).
This is a problem when in a destructor, one would like to destroy
component objects (as the Nodes of the LinkedList example).
*Forcing all classes with destructors to be auto classes -> doesn't add
any usefulness, instead just nuisances.
*Making the GC destroy objects in an order that makes members
references valid -> has a high performance cost and/or is probably just
not possible (circular references?).
- When a destructor is called during a GC (i.e., "as a finalizer") for
an object, then the member references are not valid and cannot be
referenced, *but they can be deleted*. It will be deleted iff it has not
been deleted already.
I think this can be done without significant overhead. At the end of a
GC cycle, the GC has already a list of all objects that are to be
deleted. Thus, on the release phase, it could be modified to have a flag
indicating whether the object was already deleted or not. Thus when
LinkedList deletes a Node, the delete is only performed if the object has
not already been deleted.
If an instance is deleted by the GC, the pointers that it may have to
other instances (of the same or instances of other classes) vanish. All
of those other instances may or may not have other pointers pointing to
them. So, deleting (or destructing) a particular instance, should not in
any way "cascade" to those other instances.

On the next run, the GC _may_ notice that those other instances are not
pointed-to by anything anymore, and then it may delete/destruct them.

---

So much for "regular" instance deletion. Then, we have the case where
the instance "owns" some scarce resource (a file handle, a port, or some
such). Such instances should be destructed in a _timely_ fashion _only_,
right?

In other words, instances that need explicit destruction, should be
destructed _at_the_moment_ they become obsolete -- and not "mañana".

It is conceivable that the "regular" instances do not have explicit
destructors (after all, their memory footprint would just be released to
the free pool), whereas the "resource owning" instances really do need
an explicit destructor.

Thus, the existence of an explicit destructor should be a sign that
makes [us, Walter, the compiler, anybody] understand that such an
instance _needs_ to be destructed _right_away_.

This makes one think of "auto". Now, there have been several comments
like /auto can't work/ because we don't know the scope of the instance.
That is just BS. Every piece of source code should be written
"hierarchically" (that is, not the entire program as "one function").
When one refactors the goings-on in the program to short procedures,
then it all of a sudden is not too difficult to use "auto" to manage the
lifetime of instances.
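
A minimal sketch of that style in present-day D, where scope(exit) plus destroy() plays the role that 'auto' plays in this discussion. TempFile, writeReport and the events trace are hypothetical names; the point is only that a short procedure can own the resource for exactly its own scope, including when an exception is thrown.

```d
string[] events;   // demo-only trace of what happened, in order

class TempFile     // hypothetical resource-owning class
{
    string name;
    this(string name) { this.name = name; events ~= "open " ~ name; }
    ~this() { events ~= "close " ~ name; }
}

// A short procedure owns the resource for exactly its own scope.
void writeReport(string name, bool fail)
{
    auto f = new TempFile(name);
    scope (exit) destroy(f);   // runs ~this() on every exit path
    if (fail)
        throw new Exception("disk full");
    events ~= "write " ~ name;
}

string[] demo()
{
    events = null;
    writeReport("a.tmp", false);
    try
    {
        writeReport("b.tmp", true);
    }
    catch (Exception e)
    {
        events ~= "caught";
    }
    return events;
}

void main()
{
    assert(demo() == ["open a.tmp", "write a.tmp", "close a.tmp",
                      "open b.tmp", "close b.tmp", "caught"]);
}
```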
Post by Bruno Medeiros
Still, while the previous idea might be good, it's not the optimal,
because we are not clearly apperceiving the problem/issue at hand. What
we *really* want is to directly couple the lifecycle of a component
(member) object with its composite (owner) object. A Node of a
LinkedList has the same lifecycle as its LinkedList, so Node shouldn't
even be an independent Garbage Collection managed element.
What we want is an allocator that allocates memory that is not to be
claimed by the GC (but which is to be scanned by the GC). Its behavior
is exactly like the allocator of
http://www.digitalmars.com/d/memory.html#newdelete but it should come
class LinkedList {
    ...
    void Add(Object obj) {
        Node node = mnew Node(blabla);
        ...
    }
}
Thus, when the destructor is called upon a LinkedList, either
explicitly, or by the GC, the Node references will always be valid. One
has to be careful now, as mnew'ed objects are effectively under manual
memory management, and so every mnew must have a corresponding delete,
lest there be dangling pointers or memory leaks. Nonetheless it seems to
be the only sane solution to this problem.
Another interesting addition, is to extend the concept of auto to class
members. Just as currently auto couples the lifecycle of a variable to
the enclosing function, an auto class member would couple the lifecycle
of its member to its owner object. It would get deleted implicitly when
class SomeUIWidget {
    auto Color fgcolor;
    auto Color bgcolor;
    auto Size size;
    auto Image image;
    ...
}
The auto members would then have to be initialized on a constructor or
something (the exact restrictions might vary, such as being final or not).
kris
2006-04-09 23:56:42 UTC
Permalink
Georg Wrede wrote:
[snip]
Post by Georg Wrede
So much for "regular" instance deletion. Then, we have the case where
the instance "owns" some scarce resource (a file handle, a port, or some
such). Such instances should be destructed in a _timely_ fashion _only_,
right?
In other words, instances that need explicit destruction, should be
destructed _at_the_moment_ they become obsolete -- and not "mañana".
It is conceivable that the "regular" instances do not have explicit
destructors (after all, their memory footprint would just be released to
the free pool), whereas the "resource owning" instances really do need
an explicit destructor.
Thus, the existence of an explicit destructor should be a sign that
makes [us, Walter, the compiler, anybody] understand that such an
instance _needs_ to be destructed _right_away_.
This makes one think of "auto". Now, there have been several comments
like /auto can't work/ because we don't know the scope of the instance.
That is just BS. Every piece of source code should be written
"hierarchically" (that is, not the entire program as "one function").
When one refactors the goings-on in the program to short procedures,
then it all of a sudden is not too difficult to use "auto" to manage the
lifetime of instances.
That was all sounding reasonable up until this point :)

I think we can safely put aside the entire-program-as-one-function as
unrealistic. Given that, and assuming the existence of a dtor implies
"auto" (and thus raii), how does one manage a "pool" of resources? For
example, how about a pool of DB connections? Let's assume that they need
to be correctly closed at some point, and that the pool is likely to
expand and contract based upon demand over time ...

So the question is how do those connections, and the pool itself, jive
with scoped raii? Assuming it doesn't, then one would presumably revert
to a manual dispose() pattern with such things?
Mike Capp
2006-04-10 00:46:36 UTC
Permalink
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function as
unrealistic. Given that, and assuming the existence of a dtor implies
"auto" (and thus raii), how does one manage a "pool" of resources? For
example, how about a pool of DB connections? Let's assume that they need
to be correctly closed at some point, and that the pool is likely to
expand and contract based upon demand over time ...
So the question is how do those connections, and the pool itself, jive
with scoped raii? Assuming it doesn't, then one would presumably revert
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(),
and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an instance from the
pool for the duration of its scope.
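
A rough sketch of that two-class arrangement in present-day D, under stated assumptions: Connection is a hypothetical stand-in for a real DB connection, ConnectionUsage is written as a non-copyable struct so that scope exit returns the connection, and borrow, giveBack, closeAll, idleCount and total are invented names. Closing connections stays with the pool; the usage's destructor only hands the connection back.

```d
class Connection            // hypothetical stand-in for a DB connection
{
    bool open = true;
    void close() { open = false; }
}

class ConnectionPool
{
    private Connection[] idle;
    private Connection[] all;

    // Hand out an idle connection, growing the pool on demand.
    ConnectionUsage borrow()
    {
        Connection c;
        if (idle.length)
        {
            c = idle[$ - 1];
            idle = idle[0 .. $ - 1];
        }
        else
        {
            c = new Connection;
            all ~= c;
        }
        return ConnectionUsage(this, c);
    }

    private void giveBack(Connection c) { idle ~= c; }

    // Deterministic shutdown, meant to be called once at application scope.
    void closeAll()
    {
        foreach (c; all) c.close();
        all = null;
        idle = null;
    }

    size_t idleCount() const { return idle.length; }
    size_t total() const { return all.length; }
}

struct ConnectionUsage
{
    private ConnectionPool pool;
    Connection conn;

    @disable this(this);    // a usage cannot be copied, only moved

    ~this()
    {
        // End of scope returns the connection; it does not close it.
        if (pool !is null) pool.giveBack(conn);
    }
}

void main()
{
    auto pool = new ConnectionPool;
    scope (exit) pool.closeAll();
    {
        auto a = pool.borrow();
        auto b = pool.borrow();
        assert(pool.total == 2 && pool.idleCount == 0);
    }   // both usages end here; the connections go back, still open
    assert(pool.idleCount == 2);
    {
        auto c = pool.borrow();   // reuses an idle connection
        assert(pool.total == 2 && c.conn.open);
    }
}
```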

cheers
Mike
kris
2006-04-10 01:34:47 UTC
Permalink
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function as
unrealistic. Given that, and assuming the existence of a dtor implies
"auto" (and thus raii), how does one manage a "pool" of resources? For
example, how about a pool of DB connections? Let's assume that they need
to be correctly closed at some point, and that the pool is likely to
expand and contract based upon demand over time ...
So the question is how do those connections, and the pool itself, jive
with scoped raii? Assuming it doesn't, then one would presumably revert
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(),
and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an instance from the
pool for the duration of its scope.
cheers
Mike
Thanks!

So, when culling the pool (say, on a timeout basis) the cleanup-code for
the held resource is not held within the "borrowed" dtor, but in a
dispose() method? Otherwise, said dtor would imply raii for the borrowed
connection, which would be bogus behaviour for a class instance that is
being held onto by the pool? In other words: you'd want to avoid
deleting (via raii) the connection object, so you'd have to be careful
to not use a dtor in such a case (if we assume dtor means raii).

What I'm getting at here is a potential complexity in the implementation
of pool-style designs. Perhaps not a big deal, but something to be
learned anyway? And it retains a need for the dispose() pattern?

I /think/ I prefer the simplicity of removing dtor invocation from the
GC instead (see post "GC and dtors ~ a different approach?"). How about you?
Regan Heath
2006-04-10 02:13:37 UTC
Permalink
Post by kris
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function as
unrealistic. Given that, and assuming the existence of a dtor implies
"auto" (and thus raii), how does one manage a "pool" of resources? For
example, how about a pool of DB connections? Let's assume that they
need to be correctly closed at some point, and that the pool is likely
to expand and contract based upon demand over time ...
So the question is how do those connections, and the pool itself, jive
with scoped raii? Assuming it doesn't, then one would presumably
revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(),
and a ConnectionUsage wherever you need one. Both are RAII.
ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language
limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an instance from the
pool for the duration of its scope.
cheers
Mike
Thanks!
So, when culling the pool (say, on a timeout basis) the cleanup-code for
the held resource is not held within the "borrowed" dtor, but in a
dispose() method? Otherwise, said dtor would imply raii for the borrowed
connection, which would be bogus behaviour for a class instance that is
being held onto by the pool? In other words: you'd want to avoid
deleting (via raii) the connection object, so you'd have to be careful
to not use a dtor in such a case (if we assume dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.

auto class Connection { //auto required to have dtor
HANDLE h;
~this() { CloseHandle(h); }
}

class ConnectionUsage {
shared Connection c;
}

ConnectionUsage is not required to be 'auto' because it has no 'auto'
class members which are not 'shared' resources. Alternately you implement
reference counting for the Connection class, remove shared, and add 'auto'
to ConnectionUsage.
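
The reference-counting alternative could look roughly like this in present-day D. It is a hand-rolled sketch, not a library API: Connection is modelled as a struct with a postblit and destructor, the count lives in malloc'ed memory so the destructor never touches GC-managed data, and closedHandles is a demo-only counter standing in for CloseHandle(h).

```d
import core.stdc.stdlib : free, malloc;

int closedHandles;   // demo-only: counts how often the cleanup ran

struct Connection    // hypothetical ref-counted handle wrapper
{
    private int* refs;           // shared count, null for an empty handle

    static Connection open()
    {
        Connection c;
        c.refs = cast(int*) malloc(int.sizeof);
        assert(c.refs !is null, "out of memory");
        *c.refs = 1;
        return c;
    }

    this(this)                   // postblit: a copy adds one owner
    {
        if (refs !is null) ++*refs;
    }

    ~this()
    {
        if (refs is null) return;
        if (--*refs == 0)
        {
            free(refs);
            ++closedHandles;     // last owner gone: release the resource
        }
        refs = null;
    }

    int owners() const { return refs is null ? 0 : *refs; }
}

struct ConnectionUsage
{
    Connection c;                // owning member, no 'shared' needed
}

void main()
{
    auto before = closedHandles;
    {
        auto conn = Connection.open();
        {
            auto u1 = ConnectionUsage(conn);   // copy: two owners
            auto u2 = u1;                      // copy: three owners
            assert(conn.owners == 3);
        }   // u2 and u1 are destroyed, conn is still alive
        assert(conn.owners == 1 && closedHandles == before);
    }       // last owner destroyed: cleanup runs exactly once
    assert(closedHandles == before + 1);
}
```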

Regan
kris
2006-04-10 02:27:09 UTC
Permalink
Post by Regan Heath
Post by kris
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function
as unrealistic. Given that, and assuming the existence of a dtor
implies "auto" (and thus raii), how does one manage a "pool" of
resources? For example, how about a pool of DB connections? Let's
assume that they need to be correctly closed at some point, and
that the pool is likely to expand and contract based upon demand
over time ...
So the question is how do those connections, and the pool itself,
jive with scoped raii? Assuming it doesn't, then one would
presumably revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g.
in main(),
and a ConnectionUsage wherever you need one. Both are RAII.
ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language
limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an
instance from the
pool for the duration of its scope.
cheers
Mike
Thanks!
So, when culling the pool (say, on a timeout basis) the cleanup-code
for the held resource is not held within the "borrowed" dtor, but in
a dispose() method? Otherwise, said dtor would imply raii for the
borrowed connection, which would be bogus behaviour for a class
instance that is being held onto by the pool? In other words: you'd
want to avoid deleting (via raii) the connection object, so you'd
have to be careful to not use a dtor in such a case (if we assume
dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.
auto class Connection { //auto required to have dtor
HANDLE h;
~this() { CloseHandle(h); }
}
class ConnectionUsage {
shared Connection c;
}
ConnectionUsage is not required to be 'auto' because it has no 'auto'
class members which are not 'shared' resources. Alternately you
implement reference counting for the Connection class, remove shared,
and add 'auto' to ConnectionUsage.
Regan
Yes ~ that's true.

On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
only the GC is modified to remove the dtor call during a normal
collection cycle. Invoking delete or raii just works as always, yet the
invalid dtor state is eliminated. It also eliminates the need for a
dispose() pattern, which would be nice ;-)
Dave
2006-04-10 03:01:47 UTC
Permalink
In article <e1cfpo$100u$1 at digitaldaemon.com>, kris says...
Post by kris
Post by Regan Heath
Post by kris
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function
as unrealistic. Given that, and assuming the existence of a dtor
implies "auto" (and thus raii), how does one manage a "pool" of
resources? For example, how about a pool of DB connections? Let's
assume that they need to be correctly closed at some point, and
that the pool is likely to expand and contract based upon demand
over time ...
So the question is how do those connections, and the pool itself,
jive with scoped raii? Assuming it doesn't, then one would
presumably revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g.
in main(),
and a ConnectionUsage wherever you need one. Both are RAII.
ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language
limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an
instance from the
pool for the duration of its scope.
cheers
Mike
Thanks!
So, when culling the pool (say, on a timeout basis) the cleanup-code
for the held resource is not held within the "borrowed" dtor, but in
a dispose() method? Otherwise, said dtor would imply raii for the
borrowed connection, which would be bogus behaviour for a class
instance that is being held onto by the pool? In other words: you'd
want to avoid deleting (via raii) the connection object, so you'd
have to be careful to not use a dtor in such a case (if we assume
dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.
auto class Connection { //auto required to have dtor
HANDLE h;
~this() { CloseHandle(h); }
}
class ConnectionUsage {
shared Connection c;
}
ConnectionUsage is not required to be 'auto' because it has no 'auto'
class members which are not 'shared' resources. Alternately you
implement reference counting for the Connection class, remove shared,
and add 'auto' to ConnectionUsage.
Regan
Yes ~ that's true.
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
only the GC is modified to remove the dtor call during a normal
collection cycle. Invoking delete or raii just works as always, yet the
invalid dtor state is eliminated. It also eliminates the need for a
dispose() pattern, which would be nice ;-)
So, 'auto' and delete would work as they do now, with the remaining problem of
people defining ~this() and it (inadvertently) never gets called, even at
program exit?

Hmmm if that's so, I'd add one thing -- how about something like a
"fullCollect(bool finalize = false)" that would be called with 'true' at the end
of dmain(), and could be explicitly called by the programmer?

That could run into the problem of dtors invoked in an invalid state, but at
least then it would still be deterministic (either the program ending normally
or the programmer calling fullCollect(true)).

BTW - I must have missed it, but what would be an example of a dtor called in an
invalid state?

Thanks,

- Dave
kris
2006-04-10 04:10:51 UTC
Permalink
Post by Dave
In article <e1cfpo$100u$1 at digitaldaemon.com>, kris says...
Post by kris
Post by Regan Heath
Post by kris
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function
as unrealistic. Given that, and assuming the existence of a dtor
implies "auto" (and thus raii), how does one manage a "pool" of
resources? For example, how about a pool of DB connections? Let's
assume that they need to be correctly closed at some point, and
that the pool is likely to expand and contract based upon demand
over time ...
So the question is how do those connections, and the pool itself,
jive with scoped raii? Assuming it doesn't, then one would
presumably revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g.
in main(),
and a ConnectionUsage wherever you need one. Both are RAII.
ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language
limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an
instance from the
pool for the duration of its scope.
cheers
Mike
Thanks!
So, when culling the pool (say, on a timeout basis) the cleanup-code
for the held resource is not held within the "borrowed" dtor, but in
a dispose() method? Otherwise, said dtor would imply raii for the
borrowed connection, which would be bogus behaviour for a class
instance that is being held onto by the pool? In other words: you'd
want to avoid deleting (via raii) the connection object, so you'd
have to be careful to not use a dtor in such a case (if we assume
dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.
auto class Connection { //auto required to have dtor
HANDLE h;
~this() { CloseHandle(h); }
}
class ConnectionUsage {
shared Connection c;
}
ConnectionUsage is not required to be 'auto' because it has no 'auto'
class members which are not 'shared' resources. Alternately you
implement reference counting for the Connection class, remove shared,
and add 'auto' to ConnectionUsage.
Regan
Yes ~ that's true.
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
only the GC is modified to remove the dtor call during a normal
collection cycle. Invoking delete or raii just works as always, yet the
invalid dtor state is eliminated. It also eliminates the need for a
dispose() pattern, which would be nice ;-)
So, 'auto' and delete would work as they do now, with the remaining problem of
people defining ~this() and it (inadvertently) never gets called, even at
program exit?
Hmmm if that's so, I'd add one thing -- how about something like a
"fullCollect(bool finalize = false)" that would be called with 'true' at the end
of dmain(), and could be explicitly called by the programmer?
That could run into the problem of dtors invoked in an invalid state, but at
least then it would still be deterministic (either the program ending normally
or the programmer calling fullCollect(true)).
BTW - I must have missed it, but what would be an example of a dtor called in an
invalid state?
Thanks,
- Dave
See post entitled "GC & dtors ~ a different approach" at 6:17pm ?
Regan Heath
2006-04-10 04:08:02 UTC
Permalink
Post by kris
Post by Regan Heath
Post by kris
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function
as unrealistic. Given that, and assuming the existence of a dtor
implies "auto" (and thus raii), how does one manage a "pool" of
resources? For example, how about a pool of DB connections? Let's
assume that they need to be correctly closed at some point, and
that the pool is likely to expand and contract based upon demand
over time ...
So the question is how do those connections, and the pool itself,
jive with scoped raii? Assuming it doesn't, then one would
presumably revert to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g.
in main(),
and a ConnectionUsage wherever you need one. Both are RAII.
ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language
limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an
instance from the
pool for the duration of its scope.
cheers
Mike
Thanks!
So, when culling the pool (say, on a timeout basis) the cleanup-code
for the held resource is not held within the "borrowed" dtor, but in
a dispose() method? Otherwise, said dtor would imply raii for the
borrowed connection, which would be bogus behaviour for a class
instance that is being held onto by the pool? In other words: you'd
want to avoid deleting (via raii) the connection object, so you'd
have to be careful to not use a dtor in such a case (if we assume
dtor means raii).
Unless you add a 'shared' keyword as I described in a previous post. eg.
auto class Connection { //auto required to have dtor
HANDLE h;
~this() { CloseHandle(h); }
}
class ConnectionUsage {
shared Connection c;
}
ConnectionUsage is not required to be 'auto' because it has no 'auto'
class members which are not 'shared' resources. Alternately you
implement reference counting for the Connection class, remove shared,
and add 'auto' to ConnectionUsage.
Regan
Yes ~ that's true.
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource leaks.
I'd like to think we can come up with a solution which prevents them, or
at least makes them less likely. It would be a big step up over C++ etc
and if it takes adding a keyword and/or new compiler behaviour it's a
small price to pay IMO.
Post by kris
only the GC is modified to remove the dtor call during a normal
collection cycle. Invoking delete or raii just works as always, yet the
invalid dtor state is eliminated. It also eliminates the need for a
dispose() pattern, which would be nice ;-)
At least this idea stops people doing things they shouldn't in dtors.

What I think we need to do is come up with several concrete use-cases
(actual code) which use resources which need to be released and explore
how each suggestion would affect that code, for example I'm still not
convinced the linked-list use-case mentioned here several times requires any
explicit cleanup code, isn't it all just memory to be freed by the GC? Can
someone post a code example and explain why it does please.

It seems to me that as modules already have ctor/dtors then my suggestion
can simply treat a module like a class i.e. automatically adding a dtor
(or appending to an existing dtor) which deletes the (non shared) auto
class instances at module level.

Regan
kris
2006-04-10 04:18:45 UTC
Permalink
Post by Regan Heath
Post by kris
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource
leaks. I'd like to think we can come up with a solution which prevents
them, or at least makes them less likely. It would be a big step up
over C++ etc and if it takes adding a keyword and/or new compiler
behaviour it's a small price to pay IMO.
Regarding leaks, please see related post entitled "GC & dtors ~ a
different approach" ?

I just hacked up the collector in Ares to do what is described in that
post. The quick hack doesn't do the leak-detection part, but the rest of
it works fine (there may well be cases I've overlooked but the obvious
ones, 'delete' and raii, now invoke the dtor whereas normal collection
does not).
Regan Heath
2006-04-10 05:08:44 UTC
Permalink
Post by kris
Post by Regan Heath
Post by kris
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
True, however the beauty is marred by the possibility of resource
leaks. I'd like to think we can come up with a solution which prevents
them, or at least makes them less likely. It would be a big step up
over C++ etc and if it takes adding a keyword and/or new compiler
behaviour it's a small price to pay IMO.
Regarding leaks, please see related post entitled "GC & dtors ~ a
different approach" ?
What about implicit cleanup? In this scenario, it doesn't happen. If you
don't explicitly (via delete or via raii) delete an object, the dtor is
not invoked. This applies the notion that it's better to have a leak
than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as
required, dtors are added and delete is called automatically where
required resulting in no leaks. The best part is that the compiler
enforces that by default and you have to opt-out with 'shared' to
introduce a leak.

So, assuming it's workable (Walter's call) and it's not too inflexible I
think it's a better solution. In short, I would rather not have to
explicitly manage the resources if at all possible (and I still hope it
might be).

Regan
kris
2006-04-10 05:21:39 UTC
Permalink
Post by Regan Heath
Post by kris
What about implicit cleanup? In this scenario, it doesn't happen. If
you don't explicitly (via delete or via raii) delete an object, the
dtor is not invoked. This applies the notion that it's better to have
a leak than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates as
required, dtors are added and delete is called automatically where
required resulting in no leaks. The best part is that the compiler
enforces that by default and you have to opt-out with 'shared' to
introduce a leak.
So, assuming it's workable (Walter's call) and it's not too inflexible I
think it's a better solution. In short, I would rather not have to
explicitly manage the resources if at all possible (and I still hope it
might be).
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana, some
time) was actually a negative aspect? At least, that's what Mike was
suggesting, and it seemed like a really good idea.

Along those lines, what I was suggesting is to enable dtors for explicit
cleanup only. Plus an optional runtime leak detector. I guess I like the
simplicity of that. What you suggest seems workable too, but perhaps a
little more involved?
Regan Heath
2006-04-10 06:17:34 UTC
Permalink
Post by kris
Post by Regan Heath
Post by kris
What about implicit cleanup? In this scenario, it doesn't happen. If
you don't explicitly (via delete or via raii) delete an object, the
dtor is not invoked. This applies the notion that it's better to have
a leak than a dead program. The leak is a bug to be resolved.
Whereas using my suggestion we get implicit cleanup. Auto propagates
as required, dtors are added and delete is called automatically where
required resulting in no leaks. The best part is that the compiler
enforces that by default and you have to opt-out with 'shared' to
introduce a leak.
So, assuming it's workable (Walter's call) and it's not too inflexible
I think it's a better solution. In short, I would rather not have to
explicitly manage the resources if at all possible (and I still hope
it might be).
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up?
Not my idea ;) I think any given resource has a correct time/place for
cleanup, we just need a way to specify that, ideally one that can do so
and avoid as much human error as possible (AKA resource leaks).
Post by kris
That, implicit cleanup of resources (manana, some time) was actually a
negative aspect? At least, that's what Mike was suggesting, and it
seemed like a really good idea.
It's certainly a simple solution to the problem, and it may be that it's also
the best; more use-cases will convince me (at least) one way or the other.
Post by kris
Along those lines, what I was suggesting is to enable dtors for explicit
cleanup only. Plus an optional runtime leak detector. I guess I like the
simplicity of that. What you suggest seems workable too, but perhaps a
little more involved?
It's certainly more involved. It can't be done without changes to the
compiler, but, once those are in place it can guarantee resources are
cleaned up and it can guarantee no leaks occur. (assuming I'm not missing
something obvious). The price paid for that is some flexibility (perhaps,
perhaps not - I want more use-cases to try it with), I reckon the price is
worth the benefit.

Regan
Mike Capp
2006-04-10 08:19:19 UTC
Permalink
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana, some
time) was actually a negative aspect? At least, that's what Mike was
suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.

cheers
Mike
kris
2006-04-10 09:33:32 UTC
Permalink
Post by Mike Capp
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana, some
time) was actually a negative aspect? At least, that's what Mike was
suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The
distinction being made there was whether the dtor was initiated via
delete/auto, versus normal collection by the GC (where the latter was
referred to as implicit).
Don Clugston
2006-04-10 11:30:46 UTC
Permalink
Post by kris
Post by Mike Capp
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana,
some time) was actually a negative aspect? At least, that's what Mike
was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The
distinction being made there was whether the dtor was initiated via
delete/auto, versus normal collection by the GC (where the latter was
referred to as implicit).
deterministic and non-deterministic.
Mike Capp
2006-04-10 12:29:16 UTC
Permalink
In article <e1dfmc$29r2$1 at digitaldaemon.com>, Don Clugston says...
Post by Don Clugston
Post by kris
Post by Mike Capp
Um... can we avoid using "implicit" and "explicit" in this context?
Yeah, I see the murk. What would you prefer to call them?
deterministic and non-deterministic.
Yes. Which pretty much correspond to "important" and "don't care".

cheers
Mike
kris
2006-04-10 17:57:22 UTC
Permalink
Post by Don Clugston
Post by kris
Post by Mike Capp
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana,
some time) was actually a negative aspect? At least, that's what
Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The
distinction being made there was whether the dtor was initiated via
delete/auto, versus normal collection by the GC (where the latter was
referred to as implicit).
deterministic and non-deterministic.
Thank you;
Bruno Medeiros
2006-04-13 17:00:01 UTC
Permalink
Post by Don Clugston
Post by kris
Post by Mike Capp
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana,
some time) was actually a negative aspect? At least, that's what
Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The
distinction being made there was whether the dtor was initiated via
delete/auto, versus normal collection by the GC (where the latter was
referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because explicit
destruction is deterministic, and implicit destruction is
non-deterministic), the fact of whether the destructor was called
deterministically or non-deterministically is not in itself relevant to
this issue. What is relevant is the state of the object to be destroyed
(in defined or undefined state).

So far, I'm keeping the terms "implicit" and "explicit", as they seem
adequate to me, and I don't find at all that RAII collection is
"implicit" or "without writing any code".
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Bruno Medeiros
2006-04-13 17:06:16 UTC
Permalink
Post by Don Clugston
Post by kris
Post by Mike Capp
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana,
some time) was actually a negative aspect? At least, that's what
Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The
distinction being made there was whether the dtor was initiated via
delete/auto, versus normal collection by the GC (where the latter was
referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because
*currently* explicit destruction is deterministic, and implicit
destruction is non-deterministic), the fact of whether the destructor
was called deterministically or non-deterministically is not in itself
relevant to this issue. What is relevant is the state of the object to
be destroyed (in defined or undefined state).
Nor is implicit destruction/collection inherently non-deterministic and
vice-versa (even if systems that operated this way would be impractical).

So far, I'm keeping the terms "implicit" and "explicit", as they seem
adequate to me and I don't find at all that RAII collection is
"implicit" or "without writing any code".
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Don Clugston
2006-04-18 13:25:40 UTC
Permalink
Post by Bruno Medeiros
Post by Don Clugston
Post by kris
Post by Mike Capp
In article <e1cq0t$1fm5$1 at digitaldaemon.com>, kris says...
Post by kris
I thought the idea was that classes with dtors are /intended/ to be
explicitly cleaned up? That, implicit cleanup of resources (manana,
some time) was actually a negative aspect? At least, that's what
Mike was suggesting, and it seemed like a really good idea.
Um... can we avoid using "implicit" and "explicit" in this context? "Implicit"
to me means "without writing any code", which covers both RAII and GC cleanup
(if you're lucky). "Explicit" to me means manual calls to dtors or dispose(),
which is the worst of all possible approaches.
Yeah, I see the murk. What would you prefer to call them? The
distinction being made there was whether the dtor was initiated via
delete/auto, versus normal collection by the GC (where the latter was
referred to as implicit).
deterministic and non-deterministic.
I don't like those terms. Although they are not false (because
*currently* explicit destruction is deterministic, and implicit
destruction is non-deterministic), the fact of whether the destructor
was called deterministically or non-deterministically is not in itself
relevant to this issue.
I'm not sure that this is correct, see below.
Post by Bruno Medeiros
What is relevant is the state of the object to
be destroyed (in defined or undefined state).
Nor is implicit destruction/collection inherently non-deterministic and
vice-versa (even if systems that operated this way would be impractical).
Yes, you're right, a finaliser could be invoked immediately whenever the
last reference goes out of scope. But I think (not sure) that the issues
with finalisers would disappear if they were deterministic in this
manner. At least, I'm confident that non-deterministic scope-based
destructors would suffer from the same problems that finalisers do.
Post by Bruno Medeiros
So far, I'm keeping the terms "implicit" and "explicit", as they seem
adequate to me and I don't find at all that RAII collection is
"implicit" or "without writing any code".
However, RAII has been contrasted with "explicit" memory management for
a very long time. "Explicit" has a firmly established meaning of 'new'
and 'delete', it's very confusing to use them to mean something entirely
different. (If however, the distinction is between "gc" and "non-gc",
let's call a spade a spade).

On this topic -- there's an interesting thread on comp.c++ by Andrei
Alexandrescu about gc and RAII. Among other things, he argues that
finalisers are a flawed concept that shouldn't be included. (BTW, he
seems to be getting *very* interested in D -- he now has a link to the D
spec on his website, for example -- so his opinions are worth examining).
Sean Kelly
2006-04-18 17:22:28 UTC
Permalink
Post by Don Clugston
On this topic -- there's an interesting thread on comp.c++ by Andrei
Alexandrescu about gc and RAII. Among other things, he argues that
finalisers are a flawed concept that shouldn't be included. (BTW, he
seems to be getting *very* interested in D -- he now has a link to the D
spec on his website, for example -- so his opinions are worth examining).
This seems in line with some of the other ideas discussed in this
thread, and with what I'm trying out with this latest release of Ares.
The idea is that the runtime code will be aware of how an object is
being destroyed, be it by the GC or by some other means. Currently,
that's as far as it goes unless you want to modify the finalizer
function and rebuild the runtime, but the next release will include a
hookable callback in the standard library similar to onAssertError.
This will allow the user to decide upon which behavior is most
appropriate, and to do so on a per-class basis as I am planning to pass
either the original class pointer or simply a ClassInfo object. For a
debug build it may be appropriate to report the error and terminate (say
via an assert) while some release applications may want to be a bit more
lenient.

This does impose a restriction on standard library code however, as it
must behave as if non-deterministic finalization is always illegal.
This isn't terribly difficult to accomplish, but it's something to be
aware of.


Sean
Mike Capp
2006-04-18 20:53:05 UTC
Permalink
In article <e2378s$gpn$1 at digitaldaemon.com>, Sean Kelly says...
the next release [of Ares] will include a
hookable callback in the standard library similar to onAssertError.
This will allow the user to decide upon which behavior is most
appropriate, and to do so on a per-class basis as I am planning to pass
either the original class pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there
any benefit to catching this error at runtime rather than compile time? Or is it
just that it's easier to try out this way?

cheers
Mike
Sean Kelly
2006-04-18 22:20:38 UTC
Permalink
Post by Mike Capp
In article <e2378s$gpn$1 at digitaldaemon.com>, Sean Kelly says...
the next release [of Ares] will include a
hookable callback in the standard library similar to onAssertError.
This will allow the user to decide upon which behavior is most
appropriate, and to do so on a per-class basis as I am planning to pass
either the original class pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there
any benefit to catching this error at runtime rather than compile time? Or is it
just that it's easier to try out this way?
I'm not entirely sure it would be possible to catch every instance of
this at compile-time. That aside, I very much want to avoid anything
requiring compiler changes unless Walter is the one to implement them,
and really to avoid any fundamental changes in application behavior
without Walter's approval. This is one reason I've chosen to add this
feature via a hookable callback that defaults to existing behavior (ie.
to ignore the problem and continue). The other being that I'm not
convinced such errors always warrant termination, particularly for
release builds.

To clarify, I've added two callbacks and a user-callable function to my
local build:

void setCollectHandler( collectHandlerType h );
extern (C) void onCollectResource( ClassInfo info );

onCollectResource is called whenever the GC collects an object that has
a dtor; if no user-supplied handler is provided then the call is a
no-op. I may yet replace the ClassInfo object with an Object reference,
but haven't decided whether doing so offers much over the current version.

extern (C) void onFinalizeError( ClassInfo c, Exception e );

onFinalizeError is called whenever an Exception is thrown from an object
dtor and will effectively terminate the application with a message.
This is accomplished by wrapping the passed exception in a new
system-level exception object and re-throwing. Things get a bit weird
if e is an OutOfMemoryException, but that's a possibility I'm ignoring
for now.
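
As a rough illustration of the wrap-and-rethrow idea only (this is not the
Ares source): the sketch below models the hook in ordinary D. The names
FinalizeFailure, onFinalizeErrorModel, finalizeModel, Widget and
failingCleanup are all made up for the example, and a plain function
pointer stands in for the dtor body so that nothing depends on real
runtime internals.

```d
import std.stdio;

// Hypothetical wrapper type; the post does not name the real one.
class FinalizeFailure : Exception
{
    this(string className, Exception cause)
    {
        // The original exception is kept as the chained 'next' Throwable.
        super("exception escaped the dtor of " ~ className, cause);
    }
}

// Models the hook described above: wrap the exception and re-throw.
void onFinalizeErrorModel(ClassInfo c, Exception e)
{
    throw new FinalizeFailure(c.name, e);
}

// Stand-in for the runtime's finalization step. The 'cleanup' function
// plays the role of the dtor body so the sketch stays self-contained.
void finalizeModel(Object obj, void function() cleanup)
{
    try
    {
        cleanup();
    }
    catch (Exception e)
    {
        onFinalizeErrorModel(typeid(obj), e);
    }
}

class Widget {}   // hypothetical class being finalized

// Pretend dtor body that fails.
void failingCleanup()
{
    throw new Exception("handle already closed");
}

void main()
{
    try
    {
        finalizeModel(new Widget, &failingCleanup);
    }
    catch (FinalizeFailure f)
    {
        writeln(f.msg);
        writeln("caused by: ", f.next.msg);
    }
}
```

In the scheme described above nothing would normally catch the wrapper, so
the application terminates with the message; main() catches it here only to
show that the original exception is still reachable through the chain.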


Sean
Sean Kelly
2006-04-19 02:20:34 UTC
Permalink
Post by Mike Capp
In article <e2378s$gpn$1 at digitaldaemon.com>, Sean Kelly says...
the next release [of Ares] will include a hookable callback in the
standard library similar to onAssertError. This will allow the user
to decide upon which behavior is most appropriate, and to do so on a
per-class basis as I am planning to pass either the original class
pointer or simply a ClassInfo object.
To clarify: if the decision is per-class (which I agree it should be), is there
any benefit to catching this error at runtime rather than compile time? Or is it
just that it's easier to try out this way?
As per Kris' suggestion, the (future) behavior of onCollectResource in
Ares has changed slightly. The call now has the following format:

extern (C) bool onCollectResource( Object obj );

Default behavior is as before--to silently clean up the object and
continue. However, if the user has supplied a cleanup handler and it
returns 'false' then the object's dtors will not be called. Instead,
the user code is expected to have cleaned things up another way. Thus
the user has a selection of options to choose from, in order of complexity:

* Report the error and continue, returning 'true'.
* Report the error and terminate the application.
* Clean up the object's resources by some other means and return 'false'.

The final option is to allow the user to write dtors that always assume
referenced objects are valid while allowing execution to continue if
such objects are encountered by the garbage collector (currently,
dereferencing a GCed object in a dtor may cause an access violation if
the referenced object has already been cleaned up). I'll admit that this
last option provides a lot more rope than seems prudent, but it also
makes for some interesting possibilities and I'm curious to see how
things work out :-)
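
To make the three options a little more concrete, here is a minimal sketch
that only models the hook. It assumes a handler of the shape
bool function(Object), as in the declaration above. The Connection class,
collectHandler variable, simulateCollect and both handlers are hypothetical
names for the example, and simulateCollect merely stands in for what the
collector would do on finding an unreachable object with a dtor. It uses
current D's destroy() to run the dtor.

```d
import std.stdio;

// Hypothetical resource-owning class; not from Ares.
class Connection
{
    static int openCount;   // live pretend handles, for the demo only
    bool closed;

    this() { ++openCount; }

    // Releases the pretend handle exactly once.
    void close()
    {
        if (!closed)
        {
            closed = true;
            --openCount;
        }
    }

    ~this() { close(); }
}

// Shape of the user-supplied handler assumed in this sketch.
alias CollectHandler = bool function(Object obj);

CollectHandler collectHandler;   // null means "default behaviour"

// Stand-in for the collector's step; this only models the hook.
void simulateCollect(Object obj)
{
    bool runDtor = true;
    if (collectHandler !is null)
        runDtor = collectHandler(obj);
    if (runDtor)
        destroy(obj);   // runs ~this()
}

// Option 1: report the leak, then let the dtor run as before.
bool reportAndContinue(Object obj)
{
    writeln("leaked: ", typeid(obj).name);
    return true;
}

// Option 3: clean up by other means and ask for the dtor to be skipped.
bool cleanUpManually(Object obj)
{
    if (auto c = cast(Connection) obj)
        c.close();
    return false;
}

void main()
{
    collectHandler = &reportAndContinue;
    simulateCollect(new Connection);
    writeln(Connection.openCount);

    collectHandler = &cleanUpManually;
    simulateCollect(new Connection);
    writeln(Connection.openCount);
}
```

Option 2 (report and terminate) would be the same as reportAndContinue with
an assert(false) or similar in place of the return.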


Sean

Bruno Medeiros
2006-04-10 11:42:39 UTC
Permalink
Post by Regan Heath
What I think we need to do is come up with several concrete use-cases
(actual code) which use resources which need to be released and explore
how each suggestion would affect that code. For example, I'm still not
convinced the linked-list use-case mentioned here several times requires any
explicit cleanup code; isn't it all just memory to be freed by the GC?
Can someone please post a code example and explain why it does?
Regan
See my reply to Georg:
news://news.digitalmars.com:119/e1dg8t$2akn$2 at digitaldaemon.com
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Sean Kelly
2006-04-10 16:30:49 UTC
Permalink
Post by kris
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
only the GC is modified to remove the dtor call during a normal
collection cycle. Invoking delete or raii just works as always, yet the
invalid dtor state is eliminated. It also eliminates the need for a
dispose() pattern, which would be nice ;-)
For what it's worth, I think this could be accomplished now (though I've
not tried it) as follows:

Object o = new MyObject;
gc_setFinalizer( o, null );


Sean
kris
2006-04-10 17:56:53 UTC
Permalink
Post by Sean Kelly
Post by kris
On the other hand, all these concerns would melt away if the GC were
changed to not invoke the dtor (see related post). The beauty of that
approach is that there's no additional keywords or compiler behaviour;
only the GC is modified to remove the dtor call during a normal
collection cycle. Invoking delete or raii just works as always, yet
the invalid dtor state is eliminated. It also eliminates the need for
a dispose() pattern, which would be nice ;-)
For what it's worth, I think this could be accomplished now (though I've
not tried it) as follows:
Object o = new MyObject;
gc_setFinalizer( o, null );
Nearly, but not quite the same. This certainly disables the dtor for the
given class, but if you forget to do it, your dtor will be called with an
'unspecified' (what Don called non-deterministic) state. Plus, there's
no option for capturing leaks.

I believe it's far better to stop the GC from invoking the dtor in those
cases where the state is unspecified: the system would become fully
deterministic, the need for a dispose() pattern goes away ('delete'/raii
takes over), expensive resources that should be released quickly are
always treated in that manner (consistently) or treated as leaks
otherwise, and the GC runs a little faster.

There's the edge-case whereby someone wants a dtor to be invoked lazily
by the collector, at some point in the future. That puts us back into
the non-deterministic dtor state, and is a model that Mike was
suggesting should be removed anyway (because classes that need to
release something should do so as quickly as possible). I fully agree
with Mike on this aspect, but wonder whether a simple implementation
might suffice instead (GC change only)?

Essentially what I'm suggesting is adding this to the documentation:

"a class dtor is invoked via the use of 'delete' or raii only. This
guarantees that (a) classes holding external or otherwise "expensive"
resources will release them in a timely manner, (b) the dtor
will be invoked with a fully deterministic state ~ all memory references
held by a class instance will be valid when the dtor is invoked, and
(c) there's no need for redundant cleanup-patterns such as dispose()"
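
For illustration, a minimal sketch of the two deterministic paths that
wording refers to. It is written in current D, where a 'scope' local plays
the part of the raii 'auto' discussed in this thread and destroy() plays the
part of 'delete'; that mapping is an assumption of the example, and
FileHandle, liveCount and useScoped are made-up names.

```d
import std.stdio;

// Hypothetical class holding an "expensive" resource.
class FileHandle
{
    static int liveCount;   // number of currently held resources

    this() { ++liveCount; }
    ~this() { --liveCount; }
}

// raii path: the dtor runs when 'h' goes out of scope.
int useScoped()
{
    scope h = new FileHandle;
    return FileHandle.liveCount;   // value observed while 'h' is alive
}

void main()
{
    writeln(useScoped());
    writeln(FileHandle.liveCount);

    // Explicit path: the dtor runs at the call, not at collection time.
    auto h = new FileHandle;
    destroy(h);
    writeln(FileHandle.liveCount);
}
```

In both cases the dtor runs at a known point, so anything it refers to is
still valid. An instance that was never scoped or destroyed would, under the
proposal, simply never have its dtor invoked by the collector.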

- Kris
Dave
2006-04-10 01:36:02 UTC
Permalink
In article <e1c9tc$p14$1 at digitaldaemon.com>, Mike Capp says...
Post by Mike Capp
In article <e1c6vl$moj$1 at digitaldaemon.com>, kris says...
Post by kris
I think we can safely put aside the entire-program-as-one-function as
unrealistic. Given that, and assuming the existence of a dtor implies
"auto" (and thus raii), how does one manage a "pool" of resources? For
example, how about a pool of DB connections? Let's assume that they need
to be correctly closed at some point, and that the pool is likely to
expand and contract based upon demand over time ...
So the question is how do those connections, and the pool itself, jive
with scoped raii? Assuming it doesn't, then one would presumably revert
to a manual dispose() pattern with such things?
Two different classes. A ConnectionPool at application scope, e.g. in main(),
and a ConnectionUsage wherever you need one. Both are RAII. ConnectionPool acts
as a factory for ConnectionUsage instances (modulo language limitations) and
adds to the pool as needed; ConnectionUsage just "borrows" an instance from the
pool for the duration of its scope.
cheers
Mike
That's a missing part of the puzzle - up until now IMO the changes to the
compiler would have been minimal to support "only autos can have dtors".

Now it would require another change to the language in that 'auto' is not
currently allowed for module scope classes. To support that, I guess there would
have to be code inserted along the lines of module static dtors for auto class
objects declared at module scope (except it would have to also check that each
class had been actually instantiated, obviously).

Then I guess there would be good potential for circular reference problems like
there is for module ctors and dtors with imported modules. So the compiler would
then have to insert runtime checks like it does for module ctors now, which make
things yet more complicated.