How expensive is allocation?
The 1.0 and 1.1 JDKs used a mark-sweep collector, which did compaction on some -- but not all -- collections, meaning that the heap might be fragmented after a garbage collection. Accordingly, memory allocation costs in the 1.0 and 1.1 JVMs were comparable to those in C or C++, where the allocator uses heuristics such as "first-fit" or "best-fit" to manage the free heap space. Deallocation costs were also high, since the mark-sweep collector had to sweep the entire heap at every collection. No wonder we were advised to go easy on the allocator.
In HotSpot JVMs (Sun JDK 1.2 and later), things got a lot better -- the Sun JDKs moved to a generational collector. Because a copying collector is used for the young generation, the free space in the heap is always contiguous so that allocation of a new object from the heap can be done through a simple pointer addition, as shown in Listing 1. This makes object allocation in Java applications significantly cheaper than it is in C, a possibility that many developers at first have difficulty imagining. Similarly, because copying collectors do not visit dead objects, a heap with a large number of temporary objects, which is a common situation in Java applications, costs very little to collect; simply trace and copy the live objects to a survivor space and reclaim the entire heap in one fell swoop. No free lists, no block coalescing, no compacting -- just wipe the heap clean and start over. So both allocation and deallocation costs per object went way down in JDK 1.2.
Listing 1. Fast allocation in a contiguous heap
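A minimal Java sketch of pointer-bump allocation in a contiguous heap may make the idea concrete. The class `BumpAllocator`, its fields, and the convention of returning -1 when space runs out are assumptions made for illustration; a real JVM does this in native code and would trigger a garbage collection instead of returning a failure value.

```java
/** Illustrative model only: the heap is a range of numeric addresses. */
final class BumpAllocator {
    private final Object heapLock = new Object(); // the heap is shared, so access is synchronized
    private long heapTop;                         // address of the next free byte
    private final long heapEnd;                   // one past the last usable byte

    BumpAllocator(long start, long size) {
        this.heapTop = start;
        this.heapEnd = start + size;
    }

    /** Returns the start address of an n-byte block, or -1 if the free space is too small. */
    long allocate(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        synchronized (heapLock) {
            if (heapEnd - heapTop < n) {
                return -1; // assumption: a real JVM would run a collection here
            }
            long address = heapTop;
            heapTop += n; // the whole allocation is one pointer addition
            return address;
        }
    }

    public static void main(String[] args) {
        BumpAllocator heap = new BumpAllocator(1000, 64);
        System.out.println(heap.allocate(16)); // prints 1000
        System.out.println(heap.allocate(16)); // prints 1016
    }
}
```

Because the free space is one contiguous region, there is no free list to search: the fast path is a bounds check and an addition.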
Performance advice often has a short shelf life; allocation was once expensive, but that is no longer the case. In fact, allocation is downright cheap, and with a few very compute-intensive exceptions, performance considerations are generally no longer a good reason to avoid it. Sun estimates allocation costs at approximately ten machine instructions. That's pretty much free -- certainly no reason to complicate the structure of your program or incur additional maintenance risks for the sake of eliminating a few object creations.
Of course, allocation is only half the story -- most objects that are allocated are eventually garbage collected, which also has costs. But there's good news there, too. The vast majority of objects in most Java applications become garbage before the next collection. The cost of a minor garbage collection is proportional to the number of live objects in the young generation, not the number of objects allocated since the last collection. Because so few young generation objects survive to the next collection, the amortized cost of collection per allocation is fairly small (and can be made even smaller by simply increasing the heap size, subject to the availability of enough memory).
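The claim that collection work tracks live objects rather than allocated objects can be illustrated with a toy model. The sketch below is not how any JVM is implemented: `ToyYoungGeneration`, `Node`, and the explicit root list are hypothetical, and surviving objects are simply carried over to a new list rather than physically copied. It only shows that the tracing loop never touches dead objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy model of a copying young-generation collection (illustrative only). */
final class ToyYoungGeneration {
    /** A heap object that may reference other heap objects. */
    static final class Node {
        final List<Node> refs = new ArrayList<>();
    }

    private List<Node> space = new ArrayList<>();       // everything allocated or surviving so far
    private final List<Node> roots = new ArrayList<>(); // references the program still holds

    Node allocate(boolean rooted) {
        Node n = new Node();
        space.add(n);
        if (rooted) {
            roots.add(n);
        }
        return n;
    }

    /** Traces from the roots and keeps only reachable objects; returns how many objects were visited. */
    int minorCollect() {
        Set<Node> survivors = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        int visited = 0;
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (survivors.add(n)) { // first time this object is seen
                visited++;
                pending.addAll(n.refs);
            }
        }
        space = new ArrayList<>(survivors); // the old space is dropped wholesale, dead objects untouched
        return visited;
    }

    int size() {
        return space.size();
    }

    public static void main(String[] args) {
        ToyYoungGeneration young = new ToyYoungGeneration();
        for (int i = 0; i < 100_000; i++) {
            young.allocate(false); // temporary objects, unreachable immediately
        }
        Node kept = young.allocate(true);
        kept.refs.add(young.allocate(false)); // reachable through a rooted object
        System.out.println(young.minorCollect()); // prints 2
    }
}
```

In this model, allocating ten times as many temporaries before the collection would not change the number of objects visited, which is the sense in which a larger young generation lowers the amortized cost per allocation.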
But wait, it gets better
The JIT compiler can perform additional optimizations that can reduce the cost of object allocation to zero. Consider the code in Listing 2, where the getPosition() method creates a temporary object to hold the coordinates of a point, and the calling method uses the Point object briefly and then discards it. The JIT will likely inline the call to getPosition() and, using a technique called escape analysis, can recognize that no reference to the Point object leaves the method. Knowing this, the JIT can then allocate the object on the stack instead of the heap or, even better, optimize the allocation away completely and simply hoist the fields of the Point into registers. While the current Sun JVMs do not yet perform this optimization, future JVMs probably will. The fact that allocation can get even cheaper in the future, with no changes to your code, is just one more reason not to compromise the correctness or maintainability of your program for the sake of avoiding a few extra allocations.
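The kind of code this paragraph describes might look like the following sketch. Only `Point` and `getPosition()` come from the text; the `Component` class, the `getDistanceFrom` method, and the field layout are assumptions chosen to show a temporary object that never leaves the calling method.

```java
/** Simple value holder for a pair of coordinates. */
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    Point(Point other) {
        this(other.x, other.y);
    }

    int getX() { return x; }
    int getY() { return y; }
}

/** Hypothetical caller: the class and method names here are illustrative. */
final class Component {
    private final Point position;

    Component(int x, int y) {
        this.position = new Point(x, y);
    }

    /** Returns a defensive copy, so every call creates a short-lived Point. */
    Point getPosition() {
        return new Point(position);
    }

    double getDistanceFrom(Component other) {
        // The temporary Point is read twice and then dropped; no reference
        // to it is stored or returned, so it does not escape this method.
        Point otherLocation = other.getPosition();
        int deltaX = otherLocation.getX() - position.getX();
        int deltaY = otherLocation.getY() - position.getY();
        return Math.sqrt(deltaX * deltaX + deltaY * deltaY);
    }

    public static void main(String[] args) {
        Component a = new Component(0, 0);
        Component b = new Component(3, 4);
        System.out.println(a.getDistanceFrom(b)); // prints 5.0
    }
}
```

If `getPosition()` is inlined into `getDistanceFrom()`, a compiler that performs escape analysis could in principle replace the temporary `Point` with two local integers. The defensive copy keeps `Component` safe from outside mutation without the caller having to care about the allocation.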
Isn't the allocator a scalability bottleneck?
Listing 1 shows that while allocation itself is fast, access to the heap structure must be synchronized across threads. So doesn't that make the allocator a scalability hazard? There are several clever tricks JVMs use to reduce this cost significantly. IBM JVMs use a technique called thread-local heaps, by which each thread requests a small block of memory (on the order of 1K) from the allocator, and small object allocations are satisfied out of that block. If the program requests a larger block than can be satisfied using the small thread-local heap, then the global allocator is used either to satisfy the request directly or to allocate a new thread-local heap. By this technique, a large percentage of allocations can be satisfied without contending for the shared heap lock. (Sun JVMs use a similar technique under the name "Local Allocation Blocks.")
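The thread-local heap idea can be sketched in Java as follows. This is a simplified model under stated assumptions, not the IBM or Sun implementation: the class names, the exact 1024-byte block size, the policy of sending any request larger than one block straight to the global allocator, and the -1 failure value are all illustrative choices.

```java
/** Illustrative model of thread-local allocation blocks carved from a shared heap. */
final class ThreadLocalHeapSketch {
    static final int BLOCK_SIZE = 1024; // assumption: "on the order of 1K"

    /** Shared heap: every allocation here takes the global lock. */
    static final class GlobalHeap {
        private final Object lock = new Object();
        private final long limit;
        private long top = 0;
        private long lockAcquisitions = 0;

        GlobalHeap(long size) {
            this.limit = size;
        }

        long allocate(int n) {
            synchronized (lock) {
                lockAcquisitions++;
                if (limit - top < n) {
                    return -1; // assumption: a real JVM would collect here
                }
                long address = top;
                top += n;
                return address;
            }
        }

        long lockAcquisitions() {
            synchronized (lock) {
                return lockAcquisitions;
            }
        }
    }

    /** Per-thread block; starts empty so the first allocation fetches a block. */
    private static final class LocalBlock {
        long top = 0;
        long end = 0;
    }

    private final GlobalHeap heap;
    private final ThreadLocal<LocalBlock> local = ThreadLocal.withInitial(LocalBlock::new);

    ThreadLocalHeapSketch(GlobalHeap heap) {
        this.heap = heap;
    }

    long allocate(int n) {
        if (n > BLOCK_SIZE) {
            return heap.allocate(n); // too large for a local block: use the global allocator
        }
        LocalBlock block = local.get();
        if (block.end - block.top < n) {
            // Local block exhausted: request a fresh one (the only locked step on this path).
            long start = heap.allocate(BLOCK_SIZE);
            if (start < 0) {
                return -1;
            }
            block.top = start;
            block.end = start + BLOCK_SIZE;
        }
        long address = block.top;
        block.top += n; // no lock needed: this block belongs to the current thread
        return address;
    }
}
```

With 16-byte objects, this model takes the shared lock once per 64 allocations instead of once per allocation. The unused tail of a block is simply abandoned when a new block is fetched, trading a little space for less contention.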