java, multithreading, jls, java-memory-model

Non-volatile fields + first object access from another thread (java)


I have been working on a certain server-type application for a while now, and I found that its design challenges the way I see memory coherence (so to speak) in Java.


This application uses NIO, so there is a limited number of I/O threads (they do network I/O and nothing else; they never terminate, but may block while waiting for more work).

Each connection is internally represented as an object of a specific type, let's call it ClientCon for the sake of this example. ClientCon has various session related fields, none of which are volatile. There is no synchronization of any kind in relation to getting/setting values for these fields.

Received data is made up of logical units with a fixed maximum size. Each such unit has some metadata that determines which handler type (class) to use. Once that is decided, a new object of that type is created. All such handlers have fields, none of which are volatile. An I/O thread (a concrete I/O thread is assigned to each ClientCon) then calls a protected read method on the new handler object, passing the remaining buffer contents (after the metadata has been read).

After this, the same handler object is put into a special queue, and the queue is then submitted to a thread pool (TP) for execution, where each handler's run method is called to take actions based on the data that was read. For the sake of this example, we can say that TP threads never terminate.

Therefore, a TP thread will get its hands on an object it never had access to before. All fields of that object are non-volatile (and most/all are non-final, as they are modified outside the constructor).

The handler's run method may read and set session-specific fields in ClientCon, and may also act on the handler object's own fields, whose values were set in the read method.


According to CPJ (Concurrent Programming in Java: Design and Principles):

The first time a thread accesses a field of an object, it sees either the initial value of the field or a value since written by some other thread.

A more comprehensive example of this quote can be found in JLS 17.5:

class FinalFieldExample { 
    final int x;
    int y; 
    static FinalFieldExample f;

    public FinalFieldExample() {
        x = 3; 
        y = 4; 
    } 

    static void writer() {
        f = new FinalFieldExample();
    } 

    static void reader() {
        if (f != null) {
            int i = f.x;  // guaranteed to see 3  
            int j = f.y;  // could see 0
        } 
    } 
}

The class FinalFieldExample has a final int field x and a non-final int field y. One thread might execute the method writer and another might execute the method reader.

Because the writer method writes f after the object's constructor finishes, the reader method will be guaranteed to see the properly initialized value for f.x: it will read the value 3. However, f.y is not final; the reader method is therefore not guaranteed to see the value 4 for it.
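
For contrast, here is a minimal sketch (not part of the JLS text) of the same example with one change: the shared reference is declared volatile. The class name VolatilePublicationExample is hypothetical. Under the happens-before rules, a reader that sees the non-null reference through the volatile read is also guaranteed to see the constructor's write to the non-final field y.

```java
// Hypothetical variant of the JLS example; the only change is that the
// shared static reference is volatile.
class VolatilePublicationExample {
    final int x;
    int y; // still a plain, non-final field

    // The volatile write in writer() happens-before any volatile read in
    // reader() that observes it.
    static volatile VolatilePublicationExample f;

    VolatilePublicationExample() {
        x = 3;
        y = 4;
    }

    static void writer() {
        // Constructor writes precede this volatile write in program order.
        f = new VolatilePublicationExample();
    }

    // Returns x + y if the object has been published, or -1 if f is still null.
    static int reader() {
        VolatilePublicationExample local = f; // single volatile read
        if (local != null) {
            return local.x + local.y; // both 3 and 4 are guaranteed to be visible
        }
        return -1;
    }
}
```

Reading f once into a local variable keeps the null check and the field reads tied to the same volatile read.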


This application has been running on x86 (and x86/64) Windows/Unix OSes (Linux flavors, Solaris) for years now (both Sun/Oracle and OpenJDK JVMs, versions 1.5 to 8) and apparently there have been no memory coherency issues related to received data handling. Why?


To sum it up, is there a way for a TP thread to see the object as it was initialized after construction and be unable to see all or some changes done by an I/O thread when it called the protected read method? If so, it would be nice if a detailed example could be presented.

Otherwise, are there some side effects that could cause the object's field values to always be visible in other threads (e.g. the I/O thread acquiring a monitor when adding the handler object to a queue)? Neither the I/O thread nor the TP thread synchronizes on the handler object itself. The queue does not synchronize on it either (not that it would make sense, anyway). Is this related to a concrete JVM's implementation details, perhaps?



EDIT:

Quoting JLS 17.4.5 (my comments follow each dash):

It follows from the above definitions that:

An unlock on a monitor happens-before every subsequent lock on that monitor. – Not applicable: monitor is not acquired on the handler object

A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field. – Not applicable: no volatile fields

A call to start() on a thread happens-before any actions in the started thread. – A TP thread might already exist when the queue with handler object(s) is submitted for execution. A new handler object might be added to the queue while an existing TP thread is executing.

All actions in a thread happen-before any other thread successfully returns from a join() on that thread. – Not applicable: threads do not wait for each other

The default initialization of any object happens-before any other actions (other than default-writes) of a program. – Not applicable: field writes are after default init AND after constructor finishes

When a program contains two conflicting accesses (§17.4.1) that are not ordered by a happens-before relationship, it is said to contain a data race.

and

Memory that can be shared between threads is called shared memory or heap memory.

All instance fields, static fields, and array elements are stored in heap memory. In this chapter, we use the term variable to refer to both fields and array elements.

Local variables (§14.4), formal method parameters (§8.4.1), and exception handler parameters (§14.20) are never shared between threads and are unaffected by the memory model.

Two accesses to (reads of or writes to) the same variable are said to be conflicting if at least one of the accesses is a write.

There was a write that did not force a happens-before (HB) relationship on the field(s), and later there is a read that, once again, does not force an HB relationship on those field(s). Or am I horribly wrong here? That is, nothing declares that anything about the object could have changed, so why would the JVM force-flush possibly cached values for these fields?


TL;DR

Thread #1 writes values to a new object's fields in a way that does not allow the JVM to know that those values should be propagated to other threads.

Thread #2 acquires the object that was modified after construction by Thread #1 and reads those field values.

Why does the issue described in FinalFieldExample/JLS 17.5 NEVER happen in practice?

Why does Thread #2 never see only a default-initialized object (or, alternatively, the object as it was after construction, but before or in the middle of the field value changes made by Thread #1)?


Solution

  • It might depend on what type of thread pool you are using. If it's an ExecutorService, then that class makes some strong guarantees about its tasks. From the documentation:

    Memory consistency effects: Actions in a thread prior to the submission of a Runnable or Callable task to an ExecutorService happen-before any actions taken by that task, which in turn happen-before the result is retrieved via Future.get().

    So when you initialize an object (and any other objects, for that matter) and then submit it to an ExecutorService, all of those writes are made visible to the thread that eventually runs your task.
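
    A minimal sketch of that guarantee, assuming a handler shaped roughly like the one in the question. The names PacketHandler, read, and ExecutorHandOffDemo are hypothetical, and the calling thread stands in for the I/O thread. The fields are plain (non-volatile, non-final) and are written after construction; the only synchronization is the submission itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical handler with plain fields, written outside the constructor.
class PacketHandler implements Callable<Integer> {
    private int opcode;
    private int payloadLength;

    // Called on the submitting ("I/O") thread before the task is submitted.
    void read(int opcode, int payloadLength) {
        this.opcode = opcode;
        this.payloadLength = payloadLength;
    }

    // Called on a pool thread; reads the fields written by read().
    @Override
    public Integer call() {
        return opcode + payloadLength;
    }
}

class ExecutorHandOffDemo {
    static int runOnce() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            PacketHandler handler = new PacketHandler();
            handler.read(7, 35);
            // Actions before submit() happen-before the task's actions,
            // which happen-before the result is retrieved via get().
            Future<Integer> result = pool.submit(handler);
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

    Calling ExecutorHandOffDemo.runOnce() returns 42: the pool thread is guaranteed to see both field writes because they precede the submission.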

    Now, if you rolled your own thread pool, or you're using a thread pool without these guarantees, then all bets are off. I'd say switch to something that has the guarantee, though.
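
    For a hand-rolled hand-off, what matters is whether the producer and the consumer synchronize on some common object; it does not have to be the handler itself. The sketch below is hypothetical (PlainTask, SynchronizedHandOffQueue, and HandOffDemo are made-up names, and nothing in the question says the real queue works this way). It shows a queue whose put and take both use the queue's own monitor. The unlock at the end of put happens-before the later lock in take, and since happens-before is transitive, the plain field write made before put is visible after take.

```java
import java.util.ArrayDeque;

// Hypothetical task with a plain (non-volatile, non-final) field.
class PlainTask {
    int value;
}

// Hypothetical hand-rolled queue; all access goes through the queue's monitor.
class SynchronizedHandOffQueue {
    private final ArrayDeque<PlainTask> tasks = new ArrayDeque<>();

    synchronized void put(PlainTask task) {
        tasks.addLast(task);
        notifyAll(); // wake up any consumer blocked in take()
    }

    synchronized PlainTask take() throws InterruptedException {
        while (tasks.isEmpty()) {
            wait(); // releases the monitor while waiting
        }
        return tasks.removeFirst();
    }
}

class HandOffDemo {
    static int runOnce() {
        SynchronizedHandOffQueue queue = new SynchronizedHandOffQueue();
        int[] seen = new int[1];
        Thread worker = new Thread(() -> {
            try {
                // Never synchronizes on the task, only on the queue.
                seen[0] = queue.take().value;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        PlainTask task = new PlainTask();
        task.value = 42; // plain write after construction
        queue.put(task); // monitor release publishes the write

        try {
            worker.join(); // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

    If the application's queue is built on a monitor like this, or on a java.util.concurrent collection, that alone would account for the handler's fields being visible to the TP thread.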