Mutant World

Thursday, December 21, 2006

Java Memory Model: down to the metal

Javapolis is over, and among the sessions that can be seen and heard online, there's Brian Goetz's on the Java Memory Model.

Although I have known about the new memory model for a while, I wanted to check whether there was anything new. Brian's presentation is quite entertaining (considering the density of the subject), but there was nothing beyond what I already knew.

One thing that always intrigued me, however, is: how is the new memory model actually implemented?

Before we go into the details, a small preamble about what the problem was in JDK 1.4, and how it was fixed in JDK 5.

In JDK 1.4, if two threads are executing the now deprecated double checked locking code (see below), there is no guarantee that the singleton is created only once (it may be created twice), nor - hard to believe but true - that a fully constructed singleton is returned (a partially constructed one may be returned):

public class Singleton
{
    private static Object singleton;

    public static Object getInstance()
    {
        if (singleton == null)
        {
            synchronized (Singleton.class)
            {
                if (singleton == null)
                    singleton = new Object();
            }
        }
        return singleton;
    }
}

How is it possible that the double checked locking does not work?

It is possible because in JDK 1.4 the memory model, i.e. the interaction between threads and memory, was not well defined.
On a multi-processor machine it was perfectly legal for the first processor, running the first thread, to cache the value of the singleton field (for example in a register), so that the second processor, running the second thread, could read the stale value null even after the first thread had updated it.

Fortunately, in JDK 5 the memory model has been fixed, so the effects of synchronized, volatile and final on memory visibility are now precisely defined.

These effects are outlined in Brian's presentation and defined in the Java Memory Model specification (JSR-133), so you may want to refer to those resources for further details.
The short story is that these keywords now impose memory barriers: they tell the processor to flush its caches to main memory, or to invalidate its caches and therefore read fresh data from main memory.
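To make the visibility guarantee concrete, here is a minimal sketch (not from the original post) using a volatile flag. Because the ordinary write to data happens-before the volatile write to ready, once the reader thread sees ready == true it is also guaranteed to see data == 42:

```java
public class VisibilityDemo
{
    private static int data;
    private static volatile boolean ready;

    public static void main(String[] args) throws InterruptedException
    {
        Thread reader = new Thread(new Runnable()
        {
            public void run()
            {
                while (!ready) { /* spin until the volatile write becomes visible */ }
                System.out.println(data); // guaranteed to print 42
            }
        });
        reader.start();

        data = 42;    // ordinary write, published by the volatile write below
        ready = true; // volatile write: imposes the memory barrier
        reader.join();
    }
}
```

Without the volatile modifier on ready, the reader could legally spin forever or print a stale value of data.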

In JDK 1.4, the synchronized keyword did not guarantee a memory barrier when the lock was released; in JDK 5 the memory barrier is guaranteed. (Now, you may think that in JDK 5 the double checked locking code above works fine, but it does not - not yet: the singleton field must also be declared volatile.)
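For completeness, here is a sketch of the JDK 5 fix for the example above: declaring the field volatile makes the assignment of the fully constructed object a volatile write, so no thread can observe a partially constructed singleton.

```java
public class SafeSingleton
{
    // volatile is what makes double checked locking safe under the JDK 5 memory model
    private static volatile Object singleton;

    public static Object getInstance()
    {
        if (singleton == null)                 // first check, no lock taken
        {
            synchronized (SafeSingleton.class)
            {
                if (singleton == null)         // second check, under the lock
                    singleton = new Object();  // volatile write publishes the object safely
            }
        }
        return singleton;
    }
}
```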

Now that we are done with the preamble, let's go back to my original question: how are these memory barriers implemented in the JVM?

It turns out that operating systems do not provide primitives (system calls) for memory barriers, but processors do. So the JVM goes down to the metal, from C++ to assembly, and squeezes in a few assembly instructions that tell the processor to perform a memory barrier.
For example, to implement a full memory barrier (a fence), on x86 the instruction is lock addl $0, 0(%esp) (an atomic no-op on the stack pointer), while on IA-64 there is a dedicated instruction, mf (memory fence).

Conclusion: once again I am thankful that the JVM takes care of these details for me, although it's quite fun to figure them out :D