Search code examples
javaperformancevolatile

Why access volatile variable is about 100 slower than member?


Here I wrote a test about access speed of local, member, volatile member:

public class VolatileTest {

public int member = -100;

public volatile int volatileMember = -100;

public static void main(String[] args) {
    int testloop = 10;
    for (int i = 1; i <= testloop; i++) {
        System.out.println("Round:" + i);
        VolatileTest vt = new VolatileTest();
        vt.runTest();
        System.out.println();
    }
}

public void runTest() {
    int local = -100;

    int loop = 1;
    int loop2 = Integer.MAX_VALUE;
    long startTime;

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
        }
        for (int j = 0; j < loop2; j++) {
        }
    }
    System.out.println("Empty:" + (System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
            local++;
        }
        for (int j = 0; j < loop2; j++) {
            local--;
        }
    }
    System.out.println("Local:" + (System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
            member++;
        }
        for (int j = 0; j < loop2; j++) {
            member--;
        }
    }
    System.out.println("Member:" + (System.currentTimeMillis() - startTime));

    startTime = System.currentTimeMillis();
    for (int i = 0; i < loop; i++) {
        for (int j = 0; j < loop2; j++) {
            volatileMember++;
        }
        for (int j = 0; j < loop2; j++) {
            volatileMember--;
        }
    }
    System.out.println("VMember:" + (System.currentTimeMillis() - startTime));

}
}

And here is a result on my X220 (I5 CPU):

Round:1 Empty:5 Local:10 Member:312 VMember:33378

Round:2 Empty:31 Local:0 Member:294 VMember:33180

Round:3 Empty:0 Local:0 Member:306 VMember:33085

Round:4 Empty:0 Local:0 Member:300 VMember:33066

Round:5 Empty:0 Local:0 Member:303 VMember:33078

Round:6 Empty:0 Local:0 Member:299 VMember:33398

Round:7 Empty:0 Local:0 Member:305 VMember:33139

Round:8 Empty:0 Local:0 Member:307 VMember:33490

Round:9 Empty:0 Local:0 Member:350 VMember:35291

Round:10 Empty:0 Local:0 Member:332 VMember:33838

It surprised me that access to volatile member is 100 times slower than normal member. I know there is some highlight feature about volatile member, such as a modification to it will be visible for all thread immediately, access point to volatile variable plays a role of "memory barrier". But can all these side effect be the main cause of 100 times slow?

PS: I also did a test on a Core II CPU machine. It is about 9:50, about 5 times slow. seems like this is also related to CPU arch. 5 times is still big, right?


Solution

  • Acess to volatile prevents some JIT optimisaton. This is especially important if you have a loop which doesn't really do anything as the JIT can optimise such loops away (unless you have a volatile field) If you run the loops "long" the descrepancy should increase more.

    In more realistic test, you might expect volatile to take between 30% and 10x slower for cirtical code. In most real programs it makes very little difference because the CPU is smart enough to "realise" that only one core is using the volatile field and cache it rather than using main memory.