Discussion:
Code Review for WeakReference leak in the Logging API (6942989)
Daniel D. Daugherty
2010-06-10 17:13:39 UTC
Permalink
Greetings,

I need a couple of code reviews for my fix for the WeakReference leak
in the Logging API. The webrev is relative to OpenJDK7, but the bug
is escalated so the fix will be backported to the JDK6-Update train.
That's why I need at least two code reviewers.

Here is the URL for the webrev:

http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/

For some reason, the OpenJDK bug link isn't working which is strange
considering the bug report came in from forums.java.net.

Thanks, in advance, for any reviews.

Dan
David Holmes
2010-06-11 01:08:10 UTC
Permalink
Hi Dan,

I'm surprised to see that the VM team "owns" this one :)
Post by Daniel D. Daugherty
I need a couple of code reviews for my fix for the WeakReference leak
in the Logging API. The webrev is relative to OpenJDK7, but the bug
is escalated so the fix will be backported to the JDK6-Update train.
That's why I need at least two code reviewers.
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
I looked at the code and I read the CR and I understand what you are
doing - and in that sense it seems ok.

It does seem very complicated (but the logger management is complicated
:( ).

I wonder about the performance implications of doing this search each
time a logger is added? These types of cleanups are always a fine line
between minimizing cleanup up on the main path, and ensuring too much
garbage doesn't accumulate. I'm a little concerned this is a lot of
clean up code (potentially) on the main path.

David
Daniel D. Daugherty
2010-06-11 01:50:16 UTC
Permalink
Post by David Holmes
Hi Dan,
I'm surprised to see that the VM team "owns" this one :)
I'm on the "Serviceability Team". We get all the strange
technology areas... :-)
Post by David Holmes
Post by Daniel D. Daugherty
I need a couple of code reviews for my fix for the WeakReference leak
in the Logging API. The webrev is relative to OpenJDK7, but the bug
is escalated so the fix will be backported to the JDK6-Update train.
That's why I need at least two code reviewers.
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
I looked at the code and I read the CR and I understand what you are
doing - and in that sense it seems ok.
It does seem very complicated (but the logger management is
complicated :( ).
You're being way too polite about the Logger API... :-)
Post by David Holmes
I wonder about the performance implications of doing this search each
time a logger is added? These types of cleanups are always a fine line
between minimizing cleanup up on the main path, and ensuring too much
garbage doesn't accumulate. I'm a little concerned this is a lot of
clean up code (potentially) on the main path.
Agreed. In this particular case, I'm slamming down hard on the
correctness side. The theory is that there shouldn't be too many
Logger objects in the normal system and once you've added them,
then this fix doesn't come into play.

Can I put you down as a "thumbs up"?

Dan
David Holmes
2010-06-11 02:10:43 UTC
Permalink
Post by Daniel D. Daugherty
Post by David Holmes
I wonder about the performance implications of doing this search each
time a logger is added? These types of cleanups are always a fine line
between minimizing cleanup up on the main path, and ensuring too much
garbage doesn't accumulate. I'm a little concerned this is a lot of
clean up code (potentially) on the main path.
Agreed. In this particular case, I'm slamming down hard on the
correctness side. The theory is that there shouldn't be too many
Logger objects in the normal system and once you've added them,
then this fix doesn't come into play.
So the original problem was exposed through excessive use of anonymous
loggers ... but now if those loggers are no longer referenced the next
anonymus logger created will trigger clean up of all such unreferenced
loggers (well assuming the weak-ref has been cleared), so the set of
children to traverse should be kept small.

But I still wonder if there are apps out there that will create large
logger hierarchies ...
Post by Daniel D. Daugherty
Can I put you down as a "thumbs up"?
Yes. :)

David
Post by Daniel D. Daugherty
Dan
Daniel D. Daugherty
2010-06-11 02:35:57 UTC
Permalink
Post by David Holmes
Post by Daniel D. Daugherty
Post by David Holmes
I wonder about the performance implications of doing this search
each time a logger is added? These types of cleanups are always a
fine line between minimizing cleanup up on the main path, and
ensuring too much garbage doesn't accumulate. I'm a little concerned
this is a lot of clean up code (potentially) on the main path.
Agreed. In this particular case, I'm slamming down hard on the
correctness side. The theory is that there shouldn't be too many
Logger objects in the normal system and once you've added them,
then this fix doesn't come into play.
So the original problem was exposed through excessive use of anonymous
loggers ... but now if those loggers are no longer referenced the next
anonymus logger created will trigger clean up of all such unreferenced
loggers (well assuming the weak-ref has been cleared), so the set of
children to traverse should be kept small.
The test case submitted by the customer used anonymous Loggers.
However, I believe the customer's original app used regular Loggers,
and they noticed the increasing WeakReference memory usage over a
long while. They created the anonymous Loggers test case as a way
of quickly reproducing a WeakReference leak. I have a query in about
their real usage pattern, but I haven't heard back.

One of leaking WeakReference code paths is common to both anonymous
and regular Loggers. So if something, e.g., an applet, is creating
massive numbers of anonymous Loggers, then creating either an
anonymous or regular Logger will cleanup up any stale WeakReferences
on the common code path.

The second leaking WeakReference code path is only used by regular
Loggers. So only the creation of a regular Logger can cleanup any
stale WeakReferences on that code path.

I think their app is also suffering from not saving a strong Logger
reference so the regular Loggers are being periodically GC'ed and
then recreated. A common (broken) usage of Logging is:

Logger.getLogger("MyLogger").log("some message");

By never hanging onto a strong Logger reference, the Logger named
"MyLogger" will eventually be GC'ed and then recreated the next
time that it is used. See the following bug for gory details:

6949710 3/3 the GC'able nature of Logging objects needs to be
made brutally clear

When I finish writing the JavaDoc updates for that one, perhaps
you'll be kind enough to be one of my reviewers... :-)
Post by David Holmes
But I still wonder if there are apps out there that will create large
logger hierarchies ...
Maybe, but I'll deal with that as a performance fix rather than
an escalated memory leak.
Post by David Holmes
Post by Daniel D. Daugherty
Can I put you down as a "thumbs up"?
Yes. :)
Thanks! And thanks for the thoughtful review!

Dan
Alan Bateman
2010-06-11 13:18:34 UTC
Permalink
Post by Daniel D. Daugherty
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
I went through the changes too.

I agree with the performance concern with expunging stale entries each
time a LogManager is created. Would it make sense to do this once every
N times rather than every time? Or maybe keep a timestamp so that you
limit the number of times your attempt to expunge in some time period?
(say a maximum of once a minute).

Does LogManager.deleteStaleWeakRefs need to be synchronized with any
other operations?

I wonder if Logger.kidsCleanupMarker is the best name for this [ kids
and markers usually involve clean-up but that's not what is meant here
:-) ]. Maybe rename to something that includes the word "sentinel"?

Is is necessary to use jmap in the regression test? Would it be simpler
to have a pure java test that runs with a small heap size? If it leaks
then it fails with OOME.

-Alan.
Daniel D. Daugherty
2010-06-11 14:45:34 UTC
Permalink
Post by Alan Bateman
Post by Daniel D. Daugherty
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
I went through the changes too.
Thanks!
Post by Alan Bateman
I agree with the performance concern with expunging stale entries each
time a LogManager is created.
I think you mean Logger object here. Typically, there is only one
LogManager...
Post by Alan Bateman
Would it make sense to do this once every N times rather than every
time? Or maybe keep a timestamp so that you limit the number of times
your attempt to expunge in some time period? (say a maximum of once a
minute).
Either of those schemes would be fine, but not for this fix and
not without a good reason to do so. The theory is that there
shouldn't be too many Logger objects in a normal system and
once they have been added, then this fix doesn't come into play.
I would be surprised if a real system had more than 100 Logger
objects.
Post by Alan Bateman
Does LogManager.deleteStaleWeakRefs need to be synchronized with any
other operations?
That's actually LogManager.LogNode.deleteStaleWeakRefs(). It's
only called from LogManager.addLogger() which is synchronized.
So the LogManager instance should be locked and no other thread
can be adding a Logger at the same time that we're cleaning up
any stale weak refs. Similarly, walkAndSetParent() is only called
from addLogger() so that (opportunistic) cleanup won't conflict
with the more complete deleteStaleWeakRefs().
Post by Alan Bateman
I wonder if Logger.kidsCleanupMarker is the best name for this [ kids
and markers usually involve clean-up but that's not what is meant here
:-) ]. Maybe rename to something that includes the word "sentinel"?
So would "kidsCleanupSentinel" be better? (I actually rather liked
the implied humor of kids and markers... :-))
Post by Alan Bateman
Is is necessary to use jmap in the regression test? Would it be
simpler to have a pure java test that runs with a small heap size? If
it leaks then it fails with OOME.
By using "jmap -histo:live" I'm doing a targeted sample of just
WeakReference leaks and that's what I want for this regression
test. If I was creating a general "does the Logging API leak
memory?" test, then the pure Java test with a small/limited heap
size would be a good approach.

Can I put you down as a "thumbs up"? And do you want me to do the
kidsCleanupMarker to kidsCleanupSentinel rename?

Dan

P.S.
The built-in spellchecker wants to rename you to "Batman". :-)
Mandy Chung
2010-06-11 20:09:36 UTC
Permalink
The theory is that there shouldn't be too many Logger objects in a
normal system and
once they have been added, then this fix doesn't come into play.
I would be surprised if a real system had more than 100 Logger
objects.
FYI. AWT creates a number of loggers (see
http://bugs.sun.com/view_bug.do?bug_id=6880089)

Running a simple Framer application (awt helloworld) with JDK 7 b70
creates 79 loggers on solaris-i586 and 34 loggers on windows-i586.
SwingSet2 creates a total of 85 loggers including a few non-awt ones on
solaris-i586 and 35 on windows-i586).

Mandy
Daniel D. Daugherty
2010-06-14 16:30:33 UTC
Permalink
Post by Mandy Chung
The theory is that there shouldn't be too many Logger objects in a
normal system and
once they have been added, then this fix doesn't come into play.
I would be surprised if a real system had more than 100 Logger
objects.
FYI. AWT creates a number of loggers (see
http://bugs.sun.com/view_bug.do?bug_id=6880089)
Running a simple Framer application (awt helloworld) with JDK 7 b70
creates 79 loggers on solaris-i586 and 34 loggers on windows-i586.
SwingSet2 creates a total of 85 loggers including a few non-awt ones
on solaris-i586 and 35 on windows-i586).
Thanks for the numbers. I still don't consider these numbers of
Loggers to be a lot. However, the point will be possible moot
depending on what happens with my investigation of ReferenceQueues
and Jeremy Manson's fix for this same problem.

Stay tuned!

Dan

P.S.
Mandy, did you also review the code?
Mandy Chung
2010-06-14 17:11:10 UTC
Permalink
Post by Daniel D. Daugherty
Post by Mandy Chung
The theory is that there shouldn't be too many Logger objects in a
normal system and
once they have been added, then this fix doesn't come into play.
I would be surprised if a real system had more than 100 Logger
objects.
FYI. AWT creates a number of loggers (see
http://bugs.sun.com/view_bug.do?bug_id=6880089)
Running a simple Framer application (awt helloworld) with JDK 7 b70
creates 79 loggers on solaris-i586 and 34 loggers on windows-i586.
SwingSet2 creates a total of 85 loggers including a few non-awt ones
on solaris-i586 and 35 on windows-i586).
Thanks for the numbers. I still don't consider these numbers of
Loggers to be a lot.
These numbers are not a lot. Just to show as an example that there
might be applications out there doing similar thing (one Logger per
class) that you can take into consideration.
Post by Daniel D. Daugherty
However, the point will be possible moot
depending on what happens with my investigation of ReferenceQueues
and Jeremy Manson's fix for this same problem.
Stay tuned!
Dan
P.S.
Mandy, did you also review the code?
No, I didn't.

Mandy
Alan Bateman
2010-06-11 20:39:48 UTC
Permalink
Post by Daniel D. Daugherty
Either of those schemes would be fine, but not for this fix and
not without a good reason to do so. The theory is that there
shouldn't be too many Logger objects in a normal system and
once they have been added, then this fix doesn't come into play.
I would be surprised if a real system had more than 100 Logger
objects.
I'm sure this is mostly true but there will be extreme cases where it
might be a problem. I remember Mandy discovering that AWT has a Logger
per class so you might want to check that the changes don't impact
startup - as I recall all the AWT Loggers are created in static
initializers (discussed on awt-dev a while back).
Post by Daniel D. Daugherty
So would "kidsCleanupSentinel" be better? (I actually rather liked
the implied humor of kids and markers... :-))
It is funny. I don't have a strong preference. One idea is to give it a
simple name and declare it close to the method so that it is clear that
this is the only place where it is used (your call).
Post by Daniel D. Daugherty
By using "jmap -histo:live" I'm doing a targeted sample of just
WeakReference leaks and that's what I want for this regression
test. If I was creating a general "does the Logging API leak
memory?" test, then the pure Java test with a small/limited heap
size would be a good approach.
I understand but just concerned that these multi-process tests have a
tendency to be problematic. A simple pure-java test that specifies a
Post by Daniel D. Daugherty
Can I put you down as a "thumbs up"?
Yes, I'm okay with what you have assuming that it doesn't impact AWT
startup. Also Éamonn makes a good point (so maybe create a bug so that
they code can be replaced/rewritten in jdk7?).

-Alan.
Daniel D. Daugherty
2010-06-14 16:37:56 UTC
Permalink
Post by Alan Bateman
Post by Daniel D. Daugherty
Either of those schemes would be fine, but not for this fix and
not without a good reason to do so. The theory is that there
shouldn't be too many Logger objects in a normal system and
once they have been added, then this fix doesn't come into play.
I would be surprised if a real system had more than 100 Logger
objects.
I'm sure this is mostly true but there will be extreme cases where it
might be a problem. I remember Mandy discovering that AWT has a Logger
per class so you might want to check that the changes don't impact
startup - as I recall all the AWT Loggers are created in static
initializers (discussed on awt-dev a while back).
Mandy posted numbers just before this post in the e-mail thread.
I still don't consider those numbers to be "a lot" of Loggers,
but the point is likely moot depending on the results of my
investigation into ReferenceQueues and Jeremy Manson's fix for
this same problem.
Post by Alan Bateman
Post by Daniel D. Daugherty
So would "kidsCleanupSentinel" be better? (I actually rather liked
the implied humor of kids and markers... :-))
It is funny. I don't have a strong preference. One idea is to give it
a simple name and declare it close to the method so that it is clear
that this is the only place where it is used (your call).
Hold that thought since this field will likely be going away...
Post by Alan Bateman
Post by Daniel D. Daugherty
By using "jmap -histo:live" I'm doing a targeted sample of just
WeakReference leaks and that's what I want for this regression
test. If I was creating a general "does the Logging API leak
memory?" test, then the pure Java test with a small/limited heap
size would be a good approach.
I understand but just concerned that these multi-process tests have a
tendency to be problematic. A simple pure-java test that specifies a
No argument about robustness, but I think I've done a pretty
good job of guarding the test against spurious failures. And
if it does fail strangely there are good diagnostics (IMHO).

I have different goals for the test. I only want to catch
these two fish in this net. The pure Java test may give me
more potential issues to investigate or rule out and I don't
have time for that right now.
Post by Alan Bateman
Post by Daniel D. Daugherty
Can I put you down as a "thumbs up"?
Yes, I'm okay with what you have assuming that it doesn't impact AWT
startup. Also Éamonn makes a good point (so maybe create a bug so
that they code can be replaced/rewritten in jdk7?).
Stay tuned. It may be rewritten today. :-)

Dan
Eamonn McManus
2010-06-11 16:36:49 UTC
Permalink
I think an alternative approach to the one here would be to use a global
ReferenceQueue and a subclass of WeakReference that has a pointer back
to the Logger or LogNode that contains this WeakReference. Then, in the
cases where you currently check for stale entries, you could simply poll
the ReferenceQueue and visit any Logger or LogNode that it is telling
you has a cleared WeakReference.
There are a couple of subtle points, though. First, you'll need to be
careful about synchronization before clearing WeakReferences, since the
Logger you're being told about isn't necessarily related to the current
one. Logger.kids is protected by a global lock treeLock but
LogManager.loggers is protected by the LogManager's lock, so when
logManager1.addLogger detects that a logger belonging to logManager2 has
a stale WeakReference, it needs to drop the logManager1 lock and acquire
the logManager2 one before scanning the reference list. (This means that
addLogger has to change from being a synchronized method to have a big
synchronized(this) block with a non-synchronized tail-end.)
Second, since apparently the user can introduce loops in the Logger
hierarchy, the reference from the WeakReference subclass back to the
parent Logger will itself need to be a WeakReference.
Nevertheless, I think this approach is likely to be less complicated
than the proposed one, notably because there's no need to protect
against infinite recursion; and the performance concerns should be less
because you only look for cleared WeakReferences when you know they are
there, and you know where to look.
Éamonn
Post by Alan Bateman
Post by Daniel D. Daugherty
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
I went through the changes too.
I agree with the performance concern with expunging stale entries each
time a LogManager is created. Would it make sense to do this once
every N times rather than every time? Or maybe keep a timestamp so
that you limit the number of times your attempt to expunge in some
time period? (say a maximum of once a minute).
Does LogManager.deleteStaleWeakRefs need to be synchronized with any
other operations?
I wonder if Logger.kidsCleanupMarker is the best name for this [ kids
and markers usually involve clean-up but that's not what is meant here
:-) ]. Maybe rename to something that includes the word "sentinel"?
Is is necessary to use jmap in the regression test? Would it be
simpler to have a pure java test that runs with a small heap size? If
it leaks then it fails with OOME.
-Alan.
Alan Bateman
2010-06-11 20:02:40 UTC
Permalink
Post by Eamonn McManus
I think an alternative approach to the one here would be to use a
global ReferenceQueue and a subclass of WeakReference that has a
pointer back to the Logger or LogNode that contains this
WeakReference. Then, in the cases where you currently check for stale
entries, you could simply poll the ReferenceQueue and visit any Logger
or LogNode that it is telling you has a cleared WeakReference.
There are a couple of subtle points, though. First, you'll need to be
careful about synchronization before clearing WeakReferences, since
the Logger you're being told about isn't necessarily related to the
current one. Logger.kids is protected by a global lock treeLock but
LogManager.loggers is protected by the LogManager's lock, so when
logManager1.addLogger detects that a logger belonging to logManager2
has a stale WeakReference, it needs to drop the logManager1 lock and
acquire the logManager2 one before scanning the reference list. (This
means that addLogger has to change from being a synchronized method to
have a big synchronized(this) block with a non-synchronized tail-end.)
Second, since apparently the user can introduce loops in the Logger
hierarchy, the reference from the WeakReference subclass back to the
parent Logger will itself need to be a WeakReference.
Nevertheless, I think this approach is likely to be less complicated
than the proposed one, notably because there's no need to protect
against infinite recursion; and the performance concerns should be
less because you only look for cleared WeakReferences when you know
they are there, and you know where to look.
Éamonn
This make sense as reference queues are the right way to do this. As I
understand it, Dan is looking for a low-risk solution now and so maybe
the right thing is to get that out of the way and then spend time on a
better solution for jdk7.

-Alan.
Daniel D. Daugherty
2010-06-14 16:28:08 UTC
Permalink
Post by Alan Bateman
Post by Eamonn McManus
I think an alternative approach to the one here would be to use a
global ReferenceQueue and a subclass of WeakReference that has a
pointer back to the Logger or LogNode that contains this
WeakReference. Then, in the cases where you currently check for stale
entries, you could simply poll the ReferenceQueue and visit any
Logger or LogNode that it is telling you has a cleared WeakReference.
There are a couple of subtle points, though. First, you'll need to be
careful about synchronization before clearing WeakReferences, since
the Logger you're being told about isn't necessarily related to the
current one. Logger.kids is protected by a global lock treeLock but
LogManager.loggers is protected by the LogManager's lock, so when
logManager1.addLogger detects that a logger belonging to logManager2
has a stale WeakReference, it needs to drop the logManager1 lock and
acquire the logManager2 one before scanning the reference list. (This
means that addLogger has to change from being a synchronized method
to have a big synchronized(this) block with a non-synchronized
tail-end.)
Second, since apparently the user can introduce loops in the Logger
hierarchy, the reference from the WeakReference subclass back to the
parent Logger will itself need to be a WeakReference.
Nevertheless, I think this approach is likely to be less complicated
than the proposed one, notably because there's no need to protect
against infinite recursion; and the performance concerns should be
less because you only look for cleared WeakReferences when you know
they are there, and you know where to look.
Éamonn
This make sense as reference queues are the right way to do this. As I
understand it, Dan is looking for a low-risk solution now and so maybe
the right thing is to get that out of the way and then spend time on a
better solution for jdk7.
I'll be investigating ReferenceQueues and Jeremy Manson's fix for
this problem today. It's entirely possible that we'll drop my fix
in favor of something better.

Stay tuned...

Dan
Daniel D. Daugherty
2010-06-14 16:26:20 UTC
Permalink
Eamonn,

Thanks for the review! I didn't know about ReferenceQueues so that's
a very interesting idea. I guess I need to get out of the VM codebase
a little more often... :-)

Jumping ahead in the e-mail thread, Jeremy Manson from Google has
offered the use of their fix for the problem. It looks like it uses
ReferenceQueues so I'll be taking a closer look at that today also.

More below...
Post by Eamonn McManus
I think an alternative approach to the one here would be to use a
global ReferenceQueue and a subclass of WeakReference that has a
pointer back to the Logger or LogNode that contains this
WeakReference. Then, in the cases where you currently check for stale
entries, you could simply poll the ReferenceQueue and visit any Logger
or LogNode that it is telling you has a cleared WeakReference.
ReferenceQueue is a very cool class.
Post by Eamonn McManus
There are a couple of subtle points, though. First, you'll need to be
careful about synchronization before clearing WeakReferences, since
the Logger you're being told about isn't necessarily related to the
current one. Logger.kids is protected by a global lock treeLock but
LogManager.loggers is protected by the LogManager's lock, so when
logManager1.addLogger detects that a logger belonging to logManager2
has a stale WeakReference, it needs to drop the logManager1 lock and
acquire the logManager2 one before scanning the reference list. (This
means that addLogger has to change from being a synchronized method to
have a big synchronized(this) block with a non-synchronized tail-end.)
So you're proposing using ReferenceQueues for all of the stuff
that uses WeakReferences in the Logging API and not just for the
two leaks that I've found. Consistency is a good idea so I'll
keep that in mind also.
Post by Eamonn McManus
Second, since apparently the user can introduce loops in the Logger
hierarchy, the reference from the WeakReference subclass back to the
parent Logger will itself need to be a WeakReference.
This subtle point is a little too subtle for my brain right now, but
I'll try to keep it in mind also. Since I've got an VM/NSK test that
blows up when the loops aren't accounted for, I'll have a reminder.
Post by Eamonn McManus
Nevertheless, I think this approach is likely to be less complicated
than the proposed one, notably because there's no need to protect
against infinite recursion; and the performance concerns should be
less because you only look for cleared WeakReferences when you know
they are there, and you know where to look.
I don't think I'll be able to convince anyone that a fix using
ReferenceQueues is less complicated. The fix I wrote is a pretty
simple, big hammer approach. ReferenceQueues, by your own proposal,
introduce subtleties that are less than obvious.

However, your proposal to use ReferenceQueues is a good idea and I'll
investigate it today. It helps that Jeremy Manson's fix also uses
ReferenceQueues.

Dan
Post by Eamonn McManus
Éamonn
Post by Alan Bateman
Post by Daniel D. Daugherty
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
I went through the changes too.
I agree with the performance concern with expunging stale entries
each time a LogManager is created. Would it make sense to do this
once every N times rather than every time? Or maybe keep a timestamp
so that you limit the number of times your attempt to expunge in some
time period? (say a maximum of once a minute).
Does LogManager.deleteStaleWeakRefs need to be synchronized with any
other operations?
I wonder if Logger.kidsCleanupMarker is the best name for this [ kids
and markers usually involve clean-up but that's not what is meant
here :-) ]. Maybe rename to something that includes the word
"sentinel"?
Is is necessary to use jmap in the regression test? Would it be
simpler to have a pure java test that runs with a small heap size? If
it leaks then it fails with OOME.
-Alan.
Daniel D. Daugherty
2010-06-16 04:27:04 UTC
Permalink
Revisiting this e-mail now that I've completed a first pass
at a ReferenceQueue implementation.
Post by Eamonn McManus
I think an alternative approach to the one here would be to use a
global ReferenceQueue and a subclass of WeakReference that has a
pointer back to the Logger or LogNode that contains this WeakReference.
The WeakReference subclass that I created manages weaks refs
in LogManager.loggers (now named LogManager.namedLoggers),
LogManager.LogNode.loggerRef and Logger.kids. I got the idea
of managing LogManager.loggers from Jeremy's fix.
Post by Eamonn McManus
Then, in the cases where you currently check for stale entries, you
could simply poll the ReferenceQueue and visit any Logger or LogNode
that it is telling you has a cleared WeakReference.
I created a LogManager.drainLoggerRefQueueBounded() method based on
Tony's paper:

http://java.sun.com/developer/technicalArticles/javase/finalization/

and I'm calling it on the LogManager.addLogger() and the
Logger.getAnonymousLogger(String) code paths.
Post by Eamonn McManus
There are a couple of subtle points, though. First, you'll need to be
careful about synchronization before clearing WeakReferences, since
the Logger you're being told about isn't necessarily related to the
current one. Logger.kids is protected by a global lock treeLock but
LogManager.loggers is protected by the LogManager's lock, so when
logManager1.addLogger detects that a logger belonging to logManager2
has a stale WeakReference, it needs to drop the logManager1 lock and
acquire the logManager2 one before scanning the reference list. (This
means that addLogger has to change from being a synchronized method to
have a big synchronized(this) block with a non-synchronized tail-end.)
There is only one LogManager so I don't have to worry about redoing
the LogManager synchronization stuff. drainLoggerRefQueueBounded()
isn't called from the context of a Logger object so I don't have to
worry about a current Logger versus other Logger context collision.
Post by Eamonn McManus
Second, since apparently the user can introduce loops in the Logger
hierarchy, the reference from the WeakReference subclass back to the
parent Logger will itself need to be a WeakReference.
I did make the parent Logger reference a weak reference, but not
because of loops. I did it because I didn't want to add a strong
reference to the parent Logger and keep it from being GC'ed. Sorry,
I still can't figure out what loops have to do with it.
Post by Eamonn McManus
Nevertheless, I think this approach is likely to be less complicated
than the proposed one,
I'm still not convinced this approach is less complicated, but
it is definitely a much better approach :-)
Post by Eamonn McManus
notably because there's no need to protect against infinite recursion;
No more need for protecting against infinite recursion.
Post by Eamonn McManus
and the performance concerns should be less because you only look for
cleared WeakReferences when you know they are there, and you know
where to look.
Definitely less performance concerns. I also only cleanup a set
maximum of GC'ed Logger objects per call. Tony P's paper also
recommended this and the drain*Bounded() name too.
Post by Eamonn McManus
Éamonn
Thanks again for the ReferenceQueue idea and for your thorough review.

I'm running my new version through VM/NSK logging tests right now;
it passed the SDK/JDK LOGGING_REGRESSION tests. I'll make one more
sanity check code review after that and send the revised bits out
for a re-review...

Dan

Jeremy Manson
2010-06-11 18:55:53 UTC
Permalink
We also fixed this bug internally at Google (sending the patch out was
a TODO, but we never got around to it). If you have any interest in
comparing / contrasting approaches, let us know.

Jeremy

On Thu, Jun 10, 2010 at 10:13 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Greetings,
I need a couple of code reviews for my fix for the WeakReference leak
in the Logging API. The webrev is relative to OpenJDK7, but the bug
is escalated so the fix will be backported to the JDK6-Update train.
That's why I need at least two code reviewers.
  http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
For some reason, the OpenJDK bug link isn't working which is strange
considering the bug report came in from forums.java.net.
Thanks, in advance, for any reviews.
Dan
Daniel D. Daugherty
2010-06-11 21:46:32 UTC
Permalink
Jeremy,

I'm definitely interested in learning about your approach to this issue.

Dan
Post by Jeremy Manson
We also fixed this bug internally at Google (sending the patch out was
a TODO, but we never got around to it). If you have any interest in
comparing / contrasting approaches, let us know.
Jeremy
On Thu, Jun 10, 2010 at 10:13 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Greetings,
I need a couple of code reviews for my fix for the WeakReference leak
in the Logging API. The webrev is relative to OpenJDK7, but the bug
is escalated so the fix will be backported to the JDK6-Update train.
That's why I need at least two code reviewers.
http://cr.openjdk.java.net/~dcubed/6942989-webrev/0/
For some reason, the OpenJDK bug link isn't working which is strange
considering the bug report came in from forums.java.net.
Thanks, in advance, for any reviews.
Dan
Martin Buchholz
2010-06-11 22:41:23 UTC
Permalink
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)

Feel free to take anything from our change.
Apologies for the perforce-isms.

Martin
Daniel D. Daugherty
2010-06-14 16:46:20 UTC
Permalink
Post by Martin Buchholz
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)
Feel free to take anything from our change.
Apologies for the perforce-isms.
Martin
Jeremy and Martin,

Thanks for the proposed fix. A couple of questions:

- This changeset is private to Google right now, correct? As in
it hasn't made it into OpenJDK6 yet.
- Do you plan on pushing this changeset to OpenJDK6?
- What kind of testing has been done on it?

Thanks for the offer for the code. I'll start wading through the
diffs today. Because this is an escalated issue, I will likely
be taking just the code and comments directly related to the
problem at hand. The JavaDoc fixes, even though they are useful,
will have to wait for a different changeset.

Thanks for jumping in on this thread.

Dan
Jeremy Manson
2010-06-14 17:30:42 UTC
Permalink
Daniel,

The fix hasn't made it to OpenJDK6. We were planning on pushing it to
OpenJDK6/7, but we haven't had time for it yet. If your fix is better
(I haven't had a chance to look at it), then we'll happily drop ours
in favor of yours.

For testing: I hand tested it with the "create lots of anonymous
loggers" test. My major observation was that creating that many weak
references and rebuilding the internal data structures from scratch
repeatedly can bring a system to its knees. This is why the
weakReferencesProcessed variable exists - we don't rebuild the
internal data structures with every additional logger that we lose.

We also ran jtreg, which didn't seem to have any problems.

The doc fixes: Our secret goal was to sneak those in without having to
file a separate bug, but I guess you caught us. ;)

Jeremy

On Mon, Jun 14, 2010 at 9:46 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Martin Buchholz
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)
Feel free to take anything from our change.
Apologies for the perforce-isms.
Martin
Jeremy and Martin,
- This changeset is private to Google right now, correct? As in
 it hasn't made it into OpenJDK6 yet.
- Do you plan on pushing this changeset to OpenJDK6?
- What kind of testing has been done on it?
Thanks for the offer for the code. I'll start wading through the
diffs today. Because this is an escalated issue, I will likely
be taking just the code and comments directly related to the
problem at hand. The JavaDoc fixes, even though they are useful,
will have to wait for a different changeset.
Thanks for jumping in on this thread.
Dan
Daniel D. Daugherty
2010-06-14 17:45:22 UTC
Permalink
Post by Jeremy Manson
Daniel,
The fix hasn't made it to OpenJDK6. We were planning on pushing it to
OpenJDK6/7, but we haven't had time for it yet. If your fix is better
(I haven't had a chance to look at it), then we'll happily drop ours
in favor of yours.
I will be looking at your fix today. Mine is a brute force big hammer
style fix that I'm not fond of, but does the job. Your fix will likely
address Eamonn's review comments and also solve the potential performance
impact that mine would have in systems with lots of Loggers.
Post by Jeremy Manson
For testing: I hand tested it with the "create lots of anonymous
loggers" test. My major observation was that creating that many weak
references and rebuilding the internal data structures from scratch
repeatedly can bring a system to its knees. This is why the
weakReferencesProcessed variable exists - we don't rebuild the
internal data structures with every additional logger that we lose.
I'll be sure to keep my eyes open for that part of the fix.
Post by Jeremy Manson
We also ran jtreg, which didn't seem to have any problems.
There are only 10 or so tests in JTREG. The VM/NSK logging test suite
has 550+ tests and it caught my breakage due to the currently legal
Logger loop in one of the tests.
Post by Jeremy Manson
The doc fixes: Our secret goal was to sneak those in without having to
file a separate bug, but I guess you caught us. ;)
Perhaps I can sweep those up for you. We'll have to see how it goes.

Thanks again for contributing the fix!

Dan
Post by Jeremy Manson
Jeremy
On Mon, Jun 14, 2010 at 9:46 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Martin Buchholz
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)
Feel free to take anything from our change.
Apologies for the perforce-isms.
Martin
Jeremy and Martin,
- This changeset is private to Google right now, correct? As in
it hasn't made it into OpenJDK6 yet.
- Do you plan on pushing this changeset to OpenJDK6?
- What kind of testing has been done on it?
Thanks for the offer for the code. I'll start wading through the
diffs today. Because this is an escalated issue, I will likely
be taking just the code and comments directly related to the
problem at hand. The JavaDoc fixes, even though they are useful,
will have to wait for a different changeset.
Thanks for jumping in on this thread.
Dan
Jeremy Manson
2010-06-14 18:36:11 UTC
Permalink
Daniel,

We're happy to contribute. Like you, we had a customer complaint,
which is why this happened.

My suspicion is that we don't have access to the VM/NSK test suite.
Feel free to run the patch against it.

Jeremy

On Mon, Jun 14, 2010 at 10:45 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Jeremy Manson
Daniel,
The fix hasn't made it to OpenJDK6.  We were planning on pushing it to
OpenJDK6/7, but we haven't had time for it yet.  If your fix is better
(I haven't had a chance to look at it), then we'll happily drop ours
in favor of yours.
I will be looking at your fix today. Mine is a brute force big hammer
style fix that I'm not fond of, but does the job. Your fix will likely
address Eamonn's review comments and also solve the potential performance
impact that mine would have in systems with lots of Loggers.
Post by Jeremy Manson
For testing: I hand tested it with the "create lots of anonymous
loggers" test.  My major observation was that creating that many weak
references and rebuilding the internal data structures from scratch
repeatedly can bring a system to its knees.  This is why the
weakReferencesProcessed variable exists - we don't rebuild the
internal data structures with every additional logger that we lose.
I'll be sure to keep my eyes open for that part of the fix.
Post by Jeremy Manson
We also ran jtreg, which didn't seem to have any problems.
There are only 10 or so tests in JTREG. The VM/NSK logging test suite
has 550+ tests and it caught my breakage due to the currently legal
Logger loop in one of the tests.
Post by Jeremy Manson
The doc fixes: Our secret goal was to sneak those in without having to
file a separate bug, but I guess you caught us. ;)
Perhaps I can sweep those up for you. We'll have to see how it goes.
Thanks again for contributing the fix!
Dan
Post by Jeremy Manson
Jeremy
On Mon, Jun 14, 2010 at 9:46 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Martin Buchholz
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)
Feel free to take anything from our change.
Apologies for the perforce-isms.
Martin
Jeremy and Martin,
- This changeset is private to Google right now, correct? As in
 it hasn't made it into OpenJDK6 yet.
- Do you plan on pushing this changeset to OpenJDK6?
- What kind of testing has been done on it?
Thanks for the offer for the code. I'll start wading through the
diffs today. Because this is an escalated issue, I will likely
be taking just the code and comments directly related to the
problem at hand. The JavaDoc fixes, even though they are useful,
will have to wait for a different changeset.
Thanks for jumping in on this thread.
Dan
Daniel D. Daugherty
2010-06-14 23:20:41 UTC
Permalink
Post by Jeremy Manson
Daniel,
We're happy to contribute. Like you, we had a customer complaint,
which is why this happened.
And I see that you did this work against an earlier bug. I've
made myself the RE for 6931561 and I've update the evaluation
to indicate that I'm likely to close it as a dup of the escalated
bug (6942989).
Post by Jeremy Manson
My suspicion is that we don't have access to the VM/NSK test suite.
You're right. I think that is currently an internal test suite.
Post by Jeremy Manson
Feel free to run the patch against it.
I've just finished my code review of the patch and I have some
issues with it. In the interests of making forward progress on my
escalation, I'm going to investigate creating a hybrid solution
of some of your changes, some of Eamonn's ideas about creating
WeakReference subclasses, and Tony P's ideas in the following
http://java.sun.com/developer/technicalArticles/javase/finalization/

In particular, Tony's article points out that a strong reference
to the WeakReference subclass needs to be held to make sure that
the WeakReference subclass object is still around when we want to
process it off the ReferenceQueue. Of course, that assumes I
understood what Tony said :-)

Dan
Post by Jeremy Manson
Jeremy
On Mon, Jun 14, 2010 at 10:45 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Jeremy Manson
Daniel,
The fix hasn't made it to OpenJDK6. We were planning on pushing it to
OpenJDK6/7, but we haven't had time for it yet. If your fix is better
(I haven't had a chance to look at it), then we'll happily drop ours
in favor of yours.
I will be looking at your fix today. Mine is a brute force big hammer
style fix that I'm not fond of, but does the job. Your fix will likely
address Eamonn's review comments and also solve the potential performance
impact that mine would have in systems with lots of Loggers.
Post by Jeremy Manson
For testing: I hand tested it with the "create lots of anonymous
loggers" test. My major observation was that creating that many weak
references and rebuilding the internal data structures from scratch
repeatedly can bring a system to its knees. This is why the
weakReferencesProcessed variable exists - we don't rebuild the
internal data structures with every additional logger that we lose.
I'll be sure to keep my eyes open for that part of the fix.
Post by Jeremy Manson
We also ran jtreg, which didn't seem to have any problems.
There are only 10 or so tests in JTREG. The VM/NSK logging test suite
has 550+ tests and it caught my breakage due to the currently legal
Logger loop in one of the tests.
Post by Jeremy Manson
The doc fixes: Our secret goal was to sneak those in without having to
file a separate bug, but I guess you caught us. ;)
Perhaps I can sweep those up for you. We'll have to see how it goes.
Thanks again for contributing the fix!
Dan
Post by Jeremy Manson
Jeremy
On Mon, Jun 14, 2010 at 9:46 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Martin Buchholz
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)
Feel free to take anything from our change.
Apologies for the perforce-isms.
Martin
Jeremy and Martin,
- This changeset is private to Google right now, correct? As in
it hasn't made it into OpenJDK6 yet.
- Do you plan on pushing this changeset to OpenJDK6?
- What kind of testing has been done on it?
Thanks for the offer for the code. I'll start wading through the
diffs today. Because this is an escalated issue, I will likely
be taking just the code and comments directly related to the
problem at hand. The JavaDoc fixes, even though they are useful,
will have to wait for a different changeset.
Thanks for jumping in on this thread.
Dan
Jeremy Manson
2010-06-15 20:55:22 UTC
Permalink
Okay. It sounds as if the changes were helpful, anyway. I'll be
interested to see what you do.

Jeremy

On Mon, Jun 14, 2010 at 4:20 PM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Jeremy Manson
Daniel,
We're happy to contribute.  Like you, we had a customer complaint,
which is why this happened.
And I see that you did this work against an earlier bug. I've
made myself the RE for 6931561 and I've update the evaluation
to indicate that I'm likely to close it as a dup of the escalated
bug (6942989).
Post by Jeremy Manson
My suspicion is that we don't have access to the VM/NSK test suite.
You're right. I think that is currently an internal test suite.
Post by Jeremy Manson
Feel free to run the patch against it.
I've just finished my code review of the patch and I have some
issues with it. In the interests of making forward progress on my
escalation, I'm going to investigate creating a hybrid solution
of some of your changes, some of Eamonn's ideas about creating
WeakReference subclasses, and Tony P's ideas in the following
http://java.sun.com/developer/technicalArticles/javase/finalization/
In particular, Tony's article points out that a strong reference
to the WeakReference subclass needs to be held to make sure that
the WeakReference subclass object is still around when we want to
process it off the ReferenceQueue. Of course, that assumes I
understood what Tony said :-)
Dan
Post by Jeremy Manson
Jeremy
On Mon, Jun 14, 2010 at 10:45 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Jeremy Manson
Daniel,
The fix hasn't made it to OpenJDK6.  We were planning on pushing it to
OpenJDK6/7, but we haven't had time for it yet.  If your fix is better
(I haven't had a chance to look at it), then we'll happily drop ours
in favor of yours.
I will be looking at your fix today. Mine is a brute force big hammer
style fix that I'm not fond of, but does the job. Your fix will likely
address Eamonn's review comments and also solve the potential performance
impact that mine would have in systems with lots of Loggers.
Post by Jeremy Manson
For testing: I hand tested it with the "create lots of anonymous
loggers" test.  My major observation was that creating that many weak
references and rebuilding the internal data structures from scratch
repeatedly can bring a system to its knees.  This is why the
weakReferencesProcessed variable exists - we don't rebuild the
internal data structures with every additional logger that we lose.
I'll be sure to keep my eyes open for that part of the fix.
Post by Jeremy Manson
We also ran jtreg, which didn't seem to have any problems.
There are only 10 or so tests in JTREG. The VM/NSK logging test suite
has 550+ tests and it caught my breakage due to the currently legal
Logger loop in one of the tests.
Post by Jeremy Manson
The doc fixes: Our secret goal was to sneak those in without having to
file a separate bug, but I guess you caught us. ;)
Perhaps I can sweep those up for you. We'll have to see how it goes.
Thanks again for contributing the fix!
Dan
Post by Jeremy Manson
Jeremy
On Mon, Jun 14, 2010 at 9:46 AM, Daniel D. Daugherty
Post by Daniel D. Daugherty
Post by Martin Buchholz
On Fri, Jun 11, 2010 at 14:46, Daniel D. Daugherty
Post by Daniel D. Daugherty
Jeremy,
I'm definitely interested in learning about your approach to this issue.
Here's the patch against openjdk6 by Jeremy.
http://cr.openjdk.java.net/~martin/WeakLogger-jeremymanson.patch
(It would take a bit of merging to port to openjdk7)
Feel free to take anything from our change.
Apologies for the perforce-isms.
Martin
Jeremy and Martin,
- This changeset is private to Google right now, correct? As in
 it hasn't made it into OpenJDK6 yet.
- Do you plan on pushing this changeset to OpenJDK6?
- What kind of testing has been done on it?
Thanks for the offer for the code. I'll start wading through the
diffs today. Because this is an escalated issue, I will likely
be taking just the code and comments directly related to the
problem at hand. The JavaDoc fixes, even though they are useful,
will have to wait for a different changeset.
Thanks for jumping in on this thread.
Dan
Loading...