java multithreading hibernate jpa volatile

A volatile barrier in the Hibernate source code "syncs state with other threads". How?


I was digging inside the source code of hibernate-jpa today and stumbled upon the following code snippet (that you can also find here):

private static class PersistenceProviderResolverPerClassLoader implements PersistenceProviderResolver {

    //FIXME use a ConcurrentHashMap with weak entry
    private final WeakHashMap<ClassLoader, PersistenceProviderResolver> resolvers =
            new WeakHashMap<ClassLoader, PersistenceProviderResolver>();
    private volatile short barrier = 1;

    /**
     * {@inheritDoc}
     */
    public List<PersistenceProvider> getPersistenceProviders() {
        ClassLoader cl = getContextualClassLoader();
        if ( barrier == 1 ) {} //read barrier syncs state with other threads
        PersistenceProviderResolver currentResolver = resolvers.get( cl );
        if ( currentResolver == null ) {
            currentResolver = new CachingPersistenceProviderResolver( cl );
            resolvers.put( cl, currentResolver );
            barrier = 1;
        }
        return currentResolver.getPersistenceProviders();
    }

That weird statement if ( barrier == 1 ) {} //read barrier syncs state with other threads disturbed me. I took the time to dig into the volatile keyword specification.

To put it simply, in my understanding, it ensures that any READ or WRITE operation on the corresponding variable will always be performed directly in memory, at the place where the value is normally stored. It specifically prevents accesses through caches or registers that hold a copy of the value and are not necessarily aware that the value has changed or is being modified by a concurrent thread on another core.

As a consequence, it causes a drop in performance, because every access has to go all the way to memory instead of using the usual (pipelined?) shortcuts. But it also ensures that whenever a thread reads the variable, the value is up to date.

I provided those details to let you know what my understanding of the keyword is. But now, when I re-read the code, I am telling myself: "OK, so we are slowing down execution by ensuring that a value which is always 1 is always 1 (and setting it to 1). How does that help?"

Can anybody explain this?


Solution

  • You misunderstand volatile.

    it ensures that any READ or WRITE operation on the corresponding variable will always be performed directly in memory, at the place where the value is normally stored. It specifically prevents accesses through caches or registers that hold a copy of the value and are not necessarily aware that the value has changed or is being modified by a concurrent thread on another core.

    You are describing one possible implementation, and the implementation may differ from JVM to JVM.


    volatile is better understood as a rule in the language specification. It guarantees that:

    Write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change.

    and

    Using simple atomic variable access is more efficient than accessing these variables through synchronized code, but requires more care by the programmer to avoid memory consistency errors. Whether the extra effort is worthwhile depends on the size and complexity of the application.


    In this case, volatile is not used to guarantee that barrier == 1:

    if ( barrier == 1 ) {} //read
    PersistenceProviderResolver currentResolver = resolvers.get( cl );
    if ( currentResolver == null ) {
        currentResolver = new CachingPersistenceProviderResolver( cl );
        resolvers.put( cl, currentResolver );
        barrier = 1; //write
    }
    

    It is used to guarantee that the side effects made before the write (here, the put into resolvers) are visible to other threads that later read barrier.

    Without it, if Thread1 puts something into resolvers, Thread2 might not see it.

    With it, if Thread2 reads barrier after Thread1 writes it, Thread2 is guaranteed to see that put.
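The visibility rule can be illustrated in isolation with a minimal sketch. The class and field names below (VolatilePiggyback, payload, ready) are hypothetical and are not taken from the Hibernate source. The plain field payload is written before the volatile field ready, so a thread that observes ready == true is also guaranteed to observe the payload value written before it.

```java
// Hypothetical illustration, not Hibernate code: a plain write is published
// by a later volatile write and picked up after a volatile read.
class VolatilePiggyback {
    private int payload = 0;                 // plain field, no visibility guarantee on its own
    private volatile boolean ready = false;  // plays the role of "barrier"

    void publish(int value) {
        payload = value;   // ordinary write
        ready = true;      // volatile write: everything written before it happens-before a later read of ready
    }

    int awaitAndRead() {
        while (!ready) {
            // busy-wait on the volatile read until the writer has published
        }
        return payload;    // guaranteed to be the value written before ready = true
    }

    static int demo() {
        VolatilePiggyback box = new VolatilePiggyback();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitAndRead());
        reader.start();
        box.publish(42);
        try {
            reader.join(); // join also makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Calling VolatilePiggyback.demo() returns the published value. If ready were not volatile, the reader would be allowed to spin forever or to read a stale payload.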


    There are also many other synchronization mechanisms, such as:

    • synchronized keyword
    • ReentrantLock
    • AtomicInteger
    • ....

    Usually, they also establish this happens-before relationship between different threads.
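As a rough sketch of one of those alternatives, the lookup could be guarded with synchronized instead of a volatile barrier. The class name SynchronizedResolverCache and its factory parameter are assumptions made for illustration and do not appear in Hibernate. Note that a volatile barrier only provides visibility, while WeakHashMap is documented as not synchronized, so a lock additionally provides mutual exclusion around get and put.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical sketch: a per-ClassLoader cache guarded by a lock.
class SynchronizedResolverCache<V> {
    private final Map<ClassLoader, V> cache = new WeakHashMap<>();

    V get(ClassLoader cl, Function<ClassLoader, V> factory) {
        // Leaving a synchronized block happens-before the next entry on the
        // same lock, so later callers see the put, and only one thread at a
        // time touches the map.
        synchronized (cache) {
            V value = cache.get(cl);
            if (value == null) {
                value = factory.apply(cl);
                cache.put(cl, value);
            }
            return value;
        }
    }
}
```

With this shape, two calls for the same class loader return the same cached instance, at the cost of serializing all lookups on one lock.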