Helping the garbage collector . . . not
Because allocation and garbage collection at one time imposed significant performance costs on Java programs, many clever tricks were developed to reduce these costs, such as object pooling and nulling. Unfortunately, in many cases these techniques can do more harm than good to your program's performance.
Object pooling is a straightforward concept -- maintain a pool of frequently used objects and grab one from the pool instead of creating a new one whenever needed. The theory is that pooling spreads the allocation cost over many more uses. This makes sense when object creation is expensive, as with database connections or threads, or when the pooled object represents a limited and costly resource, as database connections also do. However, the number of situations where these conditions apply is fairly small.
In addition, object pooling has some serious downsides. Because the object pool is generally shared across all threads, allocation from the object pool can be a synchronization bottleneck. Pooling also forces you to manage deallocation explicitly, which reintroduces the risks of dangling pointers. Also, the pool size must be properly tuned to get the desired performance result. If it is too small, it will not prevent allocation; and if it is too large, resources that could get reclaimed will instead sit idle in the pool. By tying up memory that could be reclaimed, the use of object pools places additional pressure on the garbage collector. Writing an effective pool implementation is not simple.
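To make these downsides concrete, here is a minimal sketch of a naive pool. The class name `SimpleBufferPool`, its methods, and the choice of `byte[]` buffers are all hypothetical and chosen only for illustration; this is not a recommended implementation.

```java
// Hypothetical, deliberately naive pool of byte[] buffers (illustration only).
final class SimpleBufferPool {
    private final java.util.ArrayDeque<byte[]> free = new java.util.ArrayDeque<>();
    private final int bufferSize;
    private final int maxPooled;

    SimpleBufferPool(int bufferSize, int maxPooled) {
        this.bufferSize = bufferSize;
        this.maxPooled = maxPooled;
    }

    // Every thread that wants a buffer contends for the same lock.
    synchronized byte[] acquire() {
        byte[] buffer = free.poll();
        // If the pool is empty (sized too small), we allocate anyway.
        return (buffer != null) ? buffer : new byte[bufferSize];
    }

    // The caller must release explicitly and must not touch the buffer
    // afterwards; nothing here prevents a stale reference from being reused.
    synchronized void release(byte[] buffer) {
        if (buffer.length == bufferSize && free.size() < maxPooled) {
            free.push(buffer); // idle buffers stay reachable and cannot be reclaimed
        }
        // Otherwise the buffer is simply dropped and left to the garbage collector.
    }

    synchronized int idleCount() {
        return free.size();
    }
}
```

Note that a caller who keeps using a buffer after calling `release()` shares it with whoever acquires it next, which is the dangling-pointer risk described above in Java form.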
In his "Performance Myths Exposed" talk at JavaOne 2003 (see Resources), Dr. Cliff Click offered concrete benchmarking data showing that object pooling is a performance loss for all but the most heavyweight objects on modern JVMs. Add in the serialization of allocation and the dangling-pointer risks, and it's clear that pooling should be avoided in all but the most extreme cases.
Explicit nulling is simply the practice of setting references to null when you are finished with the objects they refer to. The idea behind nulling is that it assists the garbage collector by making objects unreachable earlier. Or at least that's the theory.
There is one case where the use of explicit nulling is not only helpful, but virtually required, and that is where a reference to an object is scoped more broadly than it is used or considered valid by the program's specification. This includes cases such as using a static or instance field to store a reference to a temporary buffer, rather than a local variable (see Resources for a link to "Eye on performance: Referencing objects" for an example), or using an array to store references that may remain reachable by the runtime but not by the implied semantics of the program. Consider the class in Listing 3, which is an implementation of a simple bounded stack backed by an array. When an element is popped without the explicit nulling in the example, the class could cause a memory leak (more properly called "unintentional object retention," or sometimes "object loitering") because the reference stored in stack[top+1] is no longer reachable by the program, but is still considered reachable by the garbage collector.
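A minimal sketch of such a stack might look like the following. It is written to match the description above and is not necessarily identical to Listing 3; the class name `BoundedStack`, the exception choices, and the `rawSlot()` inspection helper are assumptions made for illustration.

```java
// Sketch of a bounded, array-backed stack (names are illustrative assumptions).
final class BoundedStack {
    private final Object[] stack;
    private int top = -1; // index of the current top element; -1 when empty

    BoundedStack(int capacity) {
        stack = new Object[capacity];
    }

    void push(Object value) {
        if (top == stack.length - 1) {
            throw new IllegalStateException("stack is full");
        }
        stack[++top] = value;
    }

    Object pop() {
        if (top < 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object value = stack[top];
        // Explicit nulling: without this line the array would keep the popped
        // object reachable in stack[top+1] even though the program can no
        // longer get at it through the stack's API.
        stack[top--] = null;
        return value;
    }

    // Hypothetical helper, present only so the effect of nulling can be inspected.
    Object rawSlot(int index) {
        return stack[index];
    }
}
```

After a `pop()`, the vacated slot holds null, so the popped object becomes collectible as soon as the caller drops its own reference.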
In the September 1997 "Java Developer Connection Tech Tips" column (see Resources), Sun warned of this risk and explained how explicit nulling was needed in cases like the example above. Unfortunately, programmers often take this advice too far, using explicit nulling in the hope of helping the garbage collector. But in most cases, it doesn't help the garbage collector at all, and in some cases, it can actually hurt your program's performance.
Consider the code in Listing 4, which combines several really bad ideas. The listing is a linked list implementation that uses a finalizer to walk the list and null out all the forward links. We've already discussed why finalizers are bad. This case is even worse because now the class is doing extra work, ostensibly to help the garbage collector, but that will not actually help -- and might even hurt. Walking the list takes CPU cycles and will have the effect of visiting all those dead objects and pulling them into the cache -- work that the garbage collector might be able to avoid entirely, because copying collectors do not visit dead objects at all. Nulling the references doesn't help a tracing garbage collector anyway; if the head of the list is unreachable, the rest of the list won't be traced.
Listing 4. Combining finalizers and explicit nulling for a total performance disaster -- don't do this!
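A sketch of what such a class might look like follows. It is reconstructed from the description above, not copied from the original listing; the names `FinalizedLinkedList`, `Node`, and `clearLinks()` are assumptions, and the annotation is there only because `finalize()` is deprecated in recent Java versions.

```java
// Anti-pattern sketch: a finalizer that nulls out list links. Don't do this.
final class FinalizedLinkedList {
    static final class Node {
        final Object value;
        Node next;

        Node(Object value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node head;

    void addFirst(Object value) {
        head = new Node(value, head);
    }

    int size() {
        int count = 0;
        for (Node n = head; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    // Visits every node, dead or not, just to clear its forward link.
    void clearLinks() {
        Node n = head;
        while (n != null) {
            Node next = n.next;
            n.next = null;
            n = next;
        }
        head = null;
    }

    // Extra work at finalization time that a tracing collector does not need:
    // once the list head is unreachable, the nodes are never traced anyway.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            clearLinks();
        } finally {
            super.finalize();
        }
    }
}
```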
Explicit nulling should be saved for cases where your program is subverting normal scoping rules for performance reasons, such as the stack example in Listing 3 (a more correct -- but poorly performing -- implementation would be to reallocate and copy the stack array each time it is changed).
Explicit garbage collection
A third category where developers often mistakenly think they are helping the garbage collector is the use of System.gc(), which triggers a garbage collection (actually, it merely suggests that this might be a good time for a garbage collection). Unfortunately, System.gc() triggers a full collection, which includes tracing all live objects in the heap and sweeping and compacting the old generation. This can be a lot of work. In general, it is better to let the system decide when it needs to collect the heap, and whether or not to do a full collection. Most of the time, a minor collection will do the job. Worse, calls to System.gc() are often deeply buried where developers may be unaware of their presence, and where they might get triggered far more often than necessary. If you are concerned that your application might have hidden calls to System.gc() buried in libraries, you can invoke the JVM with the -XX:+DisableExplicitGC option to prevent calls to System.gc() from triggering a garbage collection.
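One way to check whether explicit calls are actually causing collections is to compare the collector counts reported by the standard `java.lang.management` beans before and after a call. The sketch below assumes those beans report counts on your JVM; the class and method names are hypothetical.

```java
// Sketch: observe how many collections occur around an explicit System.gc() call.
final class ExplicitGcProbe {
    // Sums the collection counts reported by all collectors in this JVM.
    static long totalCollections() {
        long total = 0;
        for (java.lang.management.GarbageCollectorMXBean gc
                : java.lang.management.ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector reports no count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Collections observed while System.gc() ran. Other threads may trigger
    // collections in the same window, so treat the result as an indication only.
    static long collectionsDuringExplicitGc() {
        long before = totalCollections();
        System.gc();
        return totalCollections() - before;
    }
}
```

Running a small program that prints `ExplicitGcProbe.collectionsDuringExplicitGc()` with and without -XX:+DisableExplicitGC should show the difference; with the option set, the explicit call would normally not add to the count.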
No installment of Java theory and practice would be complete without some sort of plug for immutability. Making objects immutable eliminates entire classes of programming errors. One of the most common reasons given for not making a class immutable is the belief that doing so would compromise performance. While this is true sometimes, it is often not -- and sometimes the use of immutable objects has significant, and perhaps surprising, performance advantages.
Many objects function as containers for references to other objects. When the referenced object needs to change, we have two choices: update the reference (as we would in a mutable container class) or re-create the container to hold a new reference (as we would in an immutable container class). Listing 5 shows two ways to implement a simple holder class. Assuming the containing object is small, which is often the case (such as a Map.Entry element in a Map or a linked list element), allocating a new immutable object has some hidden performance advantages that come from the way generational garbage collectors work, having to do with the relative age of objects.
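The two approaches can be sketched roughly as follows. `MutableHolder` and `setValue()` are named in the text; the name `ImmutableHolder` and its `withValue()` helper are assumptions added for illustration, and the original listing may differ.

```java
// Mutable variant: the same holder object is updated in place.
final class MutableHolder {
    private Object value;

    MutableHolder(Object value) {
        this.value = value;
    }

    Object getValue() {
        return value;
    }

    // An existing (possibly old) holder ends up pointing at a newer object.
    void setValue(Object value) {
        this.value = value;
    }
}

// Immutable variant (hypothetical name): a change means creating a new holder.
final class ImmutableHolder {
    private final Object value;

    ImmutableHolder(Object value) {
        this.value = value;
    }

    Object getValue() {
        return value;
    }

    // Hypothetical convenience method: returns a fresh holder and leaves
    // this one untouched.
    ImmutableHolder withValue(Object newValue) {
        return new ImmutableHolder(newValue);
    }
}
```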
In most cases, when a holder object is updated to reference a different object, the new referent is a young object. If we update a MutableHolder by calling setValue(), we have created a situation where an older object references a younger one. On the other hand, by creating a new immutable holder object instead, a younger object is referencing an older one. The latter situation, where most objects point to older objects, is much more gentle on a generational garbage collector. If a MutableHolder that lives in the old generation is mutated, all the objects on the card that contains the MutableHolder must be scanned for old-to-young references at the next minor collection. The use of mutable references for long-lived container objects increases the work done to track old-to-young references at collection time. (See last month's article and this month's Resources, which explain the card-marking algorithm used to implement the write barrier in the generational collector used by current Sun JVMs.)