Mutant World: December 2006

Thursday, December 21, 2006

Java Memory Model: down to the metal

JavaPolis is over, and among the sessions that can be seen and heard online is Brian Goetz's session on the Java Memory Model.

Although I have known about the new memory model for a while, I wanted to check whether there was anything new. Brian's presentation is quite entertaining (considering how dense the subject is), but it held no news with respect to what I already knew.

One thing that has always intrigued me, however, is: how is the new memory model actually implemented?

Before going into the details, here is a short preamble about what the problem was in JDK 1.4, and how it has been fixed in JDK 5.

In JDK 1.4, if two threads execute the now discouraged double-checked locking code (see below), there is no guarantee that the singleton is created only once (it may be created twice) or, hard to believe but true, that a fully constructed singleton is returned (a partially constructed singleton may be returned):


public class Singleton
{
    private static Object singleton;
    public static Object getInstance()
    {
        if (singleton == null)
        {
            synchronized (Singleton.class)
            {
                if (singleton == null)
                    singleton = new Object();
            }
        }
        return singleton;
    }
}

How is it possible that double-checked locking does not work?

It is possible because in JDK 1.4 the memory model, i.e. the interaction between threads and memory, was not well defined.
On a multi-processor machine it was perfectly legal for the first processor, running the first thread, to cache the value of the data member singleton (for example in a register), so that the second processor, running the second thread, would read the stale value of null even if the first thread had already updated it.

Fortunately, in JDK 5 the memory model has been fixed, so that now the effects of synchronized, volatile and final with respect to the memory model have been precisely defined.

These effects are outlined in Brian's presentation, and defined here, so you may want to refer to these resources for further details.
The short story is that those keywords now impose memory barriers, i.e. they tell the processors to flush the caches to main memory, or to invalidate the caches and therefore read fresh data from main memory.
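
As an illustration of what such a barrier buys you, here is a minimal sketch of a volatile flag used to publish a value from one thread to another. The class and method names (VolatileFlagDemo, publishAndRead) are made up for this example and are not taken from Brian's presentation.

```java
// Sketch only: a volatile flag publishing a plain field across threads.
final class VolatileFlagDemo
{
    private static int payload;             // plain field, written before the flag
    private static volatile boolean ready;  // the volatile write/read pair acts as the barrier

    // Starts a writer thread, waits until the flag becomes visible, returns the payload seen.
    static int publishAndRead()
    {
        payload = 0;
        ready = false;
        Thread writer = new Thread(new Runnable()
        {
            public void run()
            {
                payload = 42;  // 1. ordinary write
                ready = true;  // 2. volatile write: publishes everything written before it
            }
        });
        writer.start();
        while (!ready)         // volatile read: always observes the latest write to the flag
        {
            Thread.yield();
        }
        return payload;        // the write of 42 happens-before this read
    }
}
```

Without the volatile modifier on ready, the reading thread would be allowed to keep seeing false forever, or to see ready as true and payload still as 0.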

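In JDK 1.4, the synchronized keyword does not guarantee a memory barrier when the lock is released. In JDK 5 the memory barrier is guaranteed. (Now, you may think that in JDK 5 the double-checked locking code above works fine, but it does not, at least not as written: the first check of singleton happens outside the synchronized block, so the field would also have to be declared volatile.)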
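
For reference, a minimal sketch of the variant that is generally considered safe under the JDK 5 memory model. It is the same code as above with the field declared volatile; the class name SafeSingleton is made up for this example.

```java
// Sketch only: double-checked locking with a volatile field (JDK 5 and later).
final class SafeSingleton
{
    // volatile is the only change with respect to the code above
    private static volatile Object singleton;

    static Object getInstance()
    {
        if (singleton == null)                  // unsynchronized check, now a volatile read
        {
            synchronized (SafeSingleton.class)
            {
                if (singleton == null)          // second check, under the lock
                    singleton = new Object();   // volatile write publishes the finished object
            }
        }
        return singleton;
    }
}
```

The volatile read in the first check is what lets a thread that never enters the synchronized block still see a fully constructed object.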

Now that we are done with the preamble, let's go back to my original question: how are these memory barriers implemented in the JVM?

It turns out that operating systems do not have primitives (system calls) that handle memory barriers, but processors do. So the JVM goes down to the metal, from C++ to assembler, and squeezes in a few assembler instructions to tell the processor to perform a memory barrier.
For example, to implement a particular memory barrier called fence, on x86 the assembler instruction is lock addl 0,(sp) where sp is the stack pointer, while on ia64 there is a dedicated assembler instruction called mf (memory fence).

Conclusion: Once again I am thankful that the JVM takes care of these details for me, although it's quite fun to figure them out :D

Tuesday, December 19, 2006

Java Closures

Java Closures seem to be the buzzword of the moment.
Unfortunately I missed the (I'm told great) JavaPolis conference in Antwerpen, Belgium, but fortunately the JavaPolis guys have put videos of the sessions online!

I recommend that anyone interested watch or listen to Neal Gafter's session on closures.

The very interesting idea behind closures in Java (as they are shaped right now, at least) is the ability to write code that somehow "extends" the Java language syntax, adding what look like new keywords to the language.
Neal Gafter himself said that if closures had been in the language before JDK 5, the new for loop syntax would probably not have been introduced.

What do closures look like?

The proposed syntax for closures in Java allows writing code such as:


Collection<T> elements = ...;
forEach (T element : elements) { doSomething(element); }

The forEach statement here looks like a new keyword of the Java language, but in reality it is a method call (assume there is a static import such as import static java.util.Collections.forEach).
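
To make the "it is just a method call" point concrete, here is a minimal sketch of how such a forEach method could be written and called in Java as it stands today, without closures. The Block interface and the Loops class are hypothetical names made up for this example; they are not part of the JDK or of the closures proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical callback type standing in for the closure passed to forEach.
interface Block<T>
{
    void invoke(T element);
}

final class Loops
{
    // A plain static method playing the role of the forEach "keyword".
    static <T> void forEach(Collection<T> elements, Block<T> body)
    {
        for (T element : elements)
            body.invoke(element);
    }

    // Usage: records the upper-cased elements, to show the body runs once per element.
    static List<String> demo()
    {
        final List<String> seen = new ArrayList<String>();
        Collection<String> elements = Arrays.asList("a", "b", "c");
        forEach(elements, new Block<String>()
        {
            public void invoke(String element)
            {
                seen.add(element.toUpperCase());
            }
        });
        return seen;
    }
}
```

The anonymous inner class is the noise that the proposed syntax would remove: with closures, the body would sit in braces right after the call, exactly like the body of a built-in loop.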

Another example is iterating over entries of a java.util.Map:


Map<K, V> props = ...;
eachEntry(K key, V value : props) { doSomething(key, value); }

where again the eachEntry statement is a call to a static method of (for example) java.util.Collections.

One can even think of replacing the synchronized keyword with the classes from java.util.concurrent.locks:


final Lock lock = ...;
sync(lock) { oneThreadAtATime(); }

where sync is a method call, and not a keyword.
It is impressive how closures make sync look like a keyword, when compared to the real synchronized keyword.
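
For comparison, a minimal sketch of what sync has to look like without closures: a static method taking a Lock and a Runnable. The Sync class and its demo method are hypothetical names made up for this example; Lock and ReentrantLock are the real classes from java.util.concurrent.locks.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class Sync
{
    // Plain method standing in for the sync "keyword": lock, run the body, always unlock.
    static void sync(Lock lock, Runnable body)
    {
        lock.lock();
        try
        {
            body.run();
        }
        finally
        {
            lock.unlock();
        }
    }

    // Usage: two threads increment a shared counter 1000 times each, under the lock.
    static int demo()
    {
        final Lock lock = new ReentrantLock();
        final int[] counter = new int[1];
        Runnable worker = new Runnable()
        {
            public void run()
            {
                for (int i = 0; i < 1000; i++)
                {
                    sync(lock, new Runnable()
                    {
                        public void run()
                        {
                            counter[0]++;  // one thread at a time
                        }
                    });
                }
            }
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        try
        {
            t1.join();
            t2.join();
        }
        catch (InterruptedException x)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(x);
        }
        return counter[0];
    }
}
```

The try/finally inside sync is the part callers tend to forget when they use Lock directly, which is why wrapping it in a method that reads like a keyword is so attractive.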

The possibilities are endless: from automatically closing streams, to measuring elapsed time, to adding functional programming to collections, to simplifying java.util.concurrent.Executor usage, etc.

Closures in Java are targeted for JDK 7. It seems strange to say, with JDK 6 released a few days ago, but I cannot wait to see closures in Java.