Quest to a more efficient LockSet (volatile field, semaphore, reentrant lock)

While I am staying up late, I recall an article about anti-sleeping pill that I read a couple of weeks ago. Somehow, I tend to have a much clearer mind late at night (I mean after I awake for many hours :-P). It explains why many of my blog was written late at night.

The feeling that I can stay up as late as I want in a Sunday is very good. A long weekend means that I can have two really long late nights for my own. Labor Day wasn't only an extra weekend day in a week, but also doubling my productive time to my project.

And, I spent it on coding a small part of the cluster cache project: the quest to a more efficient LockSet. It actually started while I was driving 100 miles north Friday. I was driving alone and turned off the stereo in my car to think about the problem. After about an hour running thru my 4-steps synchronized block LockSet in my head, I realized the only way to get an more efficient (in term of how much synchronization I need to do) lock set requires the ability to enter a semaphore before leaving the other.

I read the pseudocode of lock from Gray/Reuter book. The sequence looks something like that:
1/ semaphore get on the data structure for the lock set (ie, spinlock S),
2/ find the node represent the lock, or create one if not exists
3/ semaphore get on the node (ie, spinlock N)
4/ semaphore give the lock set (ie, unlock S)
5/ acquire the lock (may block the thread)

It sounds simple, but it cannot be done “efficiently” with Java’s synchronized blocks.

Spinlock (compare and store) is a primitive pillar in concurrent world that it cannot be reduced. Of course, you can simulate a spinlock using Java synchronized block. But, synchronized block is itself built by Spinlock.

Java 1.5 provided a set of new concurrent utilities. I know that it has a lock interface that would allow me to do what the Gray/Reuter lock implementation did. So, I dig deep into it.

After digging deep into the code, I found ReentrantLock is itself pretty expensive. Semaphore is a bit lighter, because it doesn’t maintain a linked list for thread, but it still maintaining more state than I need. But, in the process, I found what I want, the spinlock.

It is exposed to the API thru (AbstractQueuedSynchronizer/AtomicXYZ.compareAndSetState(int, int)). The “spin-unlock” is achieved by AbstractQueuedSynchronizer.setState(). The javadoc didn’t mention the memory model constraints. However, the way Semaphore implemented using those methods implies that compareAndSetState() and setState() acts as a read-barrier and a write-barrier respectively.

I would expect setState() to call a method in the other class that declare native. To my surprise, it simply sets the volatile field, that compareAndSetState() set using a native method.

Why it reqiures setting a volatile field only? I remember volatile as read/write-ordering on that specific field. But, a write-barrier is a different guarantee.

It was because I rarely find a useful case for volatile field. Because the guarantee was on the field only, you cannot use it to guard another data. While I aware JMM changes in Java 5, and followed the mailing list for a few good months, I didn’t pay much attention to volatile field changes.

Brian Goetz explained it very well with his article in developerWorks.

The semantic of volatile field in Java 5 is updated to have memory boundary guarantee.

Now, after this quest to a more efficient lock set, I gained the understanding of not only how to implement it efficiently, but also what was missing in older version of Java, why JMM change is needed, way better understanding of the JMM itself.

It is a good feeling that what I need is already created by someone else before I need it.

Now, I think I know JMM very well, come challenge me with tough questions! :-P

Tag: concurrency