Locking Strategies

I’ve recently developed a bit of a love affair with the ReentrantReadWriteLock class. This class provides two separate locks: a read lock that can be held concurrently by multiple readers, and a write lock that can have only one owner at a time (and prevents read locks from being granted while it is held). This granularity lets you tune access to shared state for a many-readers / few-writers scenario. I decided it was worth actually measuring the difference, rather than just assuming it would be better. Additionally, there is some syntactic overhead to using locks manually, so I wanted to be sure the performance benefit was worth the added ugliness. The ugliness I’m referring to is the best practice of unlocking inside a finally block, which ensures that a thrown exception can’t leave the lock held forever and deadlock other threads.

readWriteLock.readLock().lock();
try {
  // Do some stuff 
} finally {
  readWriteLock.readLock().unlock();
}

Having this sprinkled all over the place can become a bit of an eyesore, even if it is necessary. My hypothesis was that when the number of readers exceeds the number of writers, allowing readers to share the lock concurrently would cause a significant throughput increase when measured against the single-owner-at-a-time effect of synchronized blocks.
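To make the comparison concrete, here is a minimal sketch of the two guard styles side by side on a shared array. This is not the test harness itself; the class and field names here are my own:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GuardStyles {
    private static final long[] data = new long[1024];
    private static final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    // Synchronized: readers and writers all serialize on the same monitor.
    static synchronized long readSync(int i) {
        return data[i];
    }

    // Read lock: any number of readers may hold it at the same time.
    static long readGranular(int i) {
        rwLock.readLock().lock();
        try {
            return data[i];
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // Write lock: exclusive, blocks readers and other writers.
    static void writeGranular(int i, long v) {
        rwLock.writeLock().lock();
        try {
            data[i] = v;
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        writeGranular(0, 42L);
        System.out.println(readGranular(0)); // prints 42
        System.out.println(readSync(0));     // prints 42
    }
}
```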

The Test, Take One

For the first pass of the test, I wrote a class that will read some command line options to set the number of reader and writer threads, as well as specifying which lock strategy to use and how long the test should run. When the test completes, the number of read and write operations performed is printed to standard out.

The complete code is available here. You can run this test with the following: java -jar YOUR_JAR_FILE -accessmethod locks -duration 30 -readerthreads 13 -writerthreads 3

The accessmethod param can be either ‘locks’ or ‘sync’, and the duration param specifies how many seconds the test should run.

To test my hypothesis, I ran this test with all the combinations of reader and writer threads summing to 16 total threads, and compared the number of completed operations for each one. The results really surprised me:

So what’s going on here? Why are the explicit locks performing so terribly compared to the synchronized blocks? After some investigation, I determined that because of the extremely small amount of work being done in the critical sections, the overhead of using the locks drastically outweighed any reduction in contention: there basically was no contention with such minimal time spent dealing with shared state!

// The read operation
long dataVal = SyncVsGranularLocks.dataBlock[
  counter++ & SyncVsGranularLocks.mask
];
return dataVal;
// ...
// The write operation
SyncVsGranularLocks.dataBlock[counter++ & SyncVsGranularLocks.mask] = counter;

These operations are extremely fast and cause no measurable contention at all!
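As an aside, the counter++ & mask expression in those operations works because (as the idiom suggests) the data block’s length is a power of two, so masking wraps the index without a modulo. A tiny illustration with assumed names and sizes, not the harness’s actual values:

```java
public class MaskIndex {
    // Assumed power-of-two length; the real harness's size may differ.
    static final long[] dataBlock = new long[8];
    static final int mask = dataBlock.length - 1; // 7, i.e. 0b111

    public static void main(String[] args) {
        int counter = 13;
        // 13 & 7 == 0b1101 & 0b0111 == 0b0101 == 5: wraps without a modulo
        System.out.println(counter & mask); // prints 5
    }
}
```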

The Test, Take Two

To get some more meaningful results, I added a single millisecond sleep in each of the critical sections. This is not exactly representative of the behavior of a production system (reads rarely take the same amount of time as writes), but it does help us get a better ballpark of the difference between the two locking strategies.
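The change amounts to a Thread.sleep inside each locked region. A minimal sketch of the slowed-down read path, simplified from the real harness with names of my own choosing:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SlowRead {
    static final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    static long value = 0;

    static long slowRead() throws InterruptedException {
        rwLock.readLock().lock();
        try {
            // Simulate ~1 ms of work while holding the read lock,
            // so the lock is actually contended for a measurable time.
            Thread.sleep(1);
            return value;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(slowRead()); // prints 0
    }
}
```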

Running the test again with the sleeps added and using a non-fair strategy with the ReentrantReadWriteLock, we get the following results:

Wow. Well, for a high ratio of readers to writers, it sure is good to be one of the few writers when using locks! Using the non-fair strategy with the ReentrantReadWriteLock ended up seriously starving our reader threads! Running the test again with fairness specified, using 14 reader threads and 2 writer threads (again for 30 seconds), we end up with these results:

Using a fair lock strategy gave us exactly the same number of operations for each reader and writer thread, and far out-performed the synchronization block strategy.
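For reference, fairness is chosen when the lock is constructed: the no-argument ReentrantReadWriteLock constructor gives a non-fair lock, and passing true requests the fair ordering policy.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class FairnessDemo {
    public static void main(String[] args) {
        // Default is non-fair: throughput-oriented, but can starve waiters.
        ReentrantReadWriteLock nonFair = new ReentrantReadWriteLock();
        // true requests fairness: longest-waiting threads acquire first.
        ReentrantReadWriteLock fair = new ReentrantReadWriteLock(true);
        System.out.println(nonFair.isFair()); // prints false
        System.out.println(fair.isFair());    // prints true
    }
}
```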

The Test, Take Three

Finally, let’s see what happens when we vary the amount of time we sleep in the read and write operations.

For this series of tests, we use 14 reader threads and 2 writer threads for each test. Additionally, we run this test with fairness set in the locks:

Of note is that while our granular locks outperform the synchronized blocks in all cases, the read performance edges closer to equality as write operations take more and more time.

The code used for this final test can be found here.

This final version adds the ‘readersleep’ and ‘writersleep’ params to adjust how long read operations and write operations will sleep, respectively. Invoke it with: java -jar YOUR_JAR_FILE -accessmethod sync -duration 30 -readerthreads 14 -writerthreads 2 -readersleep 1 -writersleep 2

Analysis

So looking at these test results, we’ve found that for very quick operations, the overhead of using manual locks far outweighs any potential benefit in reduced contention. In situations where the operations do take some amount of time, we can get a large increase in throughput by using granular locking. Finally, we’ve found that with a large number of participants, using a fair locking strategy may be critical to avoid starving some of them! All of this can be summarized as independently validating the following, from the ReentrantReadWriteLock javadoc:

ReentrantReadWriteLocks can be used to improve concurrency in some uses of some kinds of Collections. This is typically worthwhile only when the collections are expected to be large, accessed by more reader threads than writer threads, and entail operations with overhead that outweighs synchronization overhead.

Until next time!