I wrote a test comparing the access speed of a local variable, a member field, and a volatile member field:
public class VolatileTest {
    public int member = -100;
    public volatile int volatileMember = -100;

    public static void main(String[] args) {
        int testloop = 10;
        for (int i = 1; i <= testloop; i++) {
            System.out.println("Round:" + i);
            VolatileTest vt = new VolatileTest();
            vt.runTest();
            System.out.println();
        }
    }

    public void runTest() {
        int local = -100;
        int loop = 1;
        int loop2 = Integer.MAX_VALUE;
        long startTime;

        startTime = System.currentTimeMillis();
        for (int i = 0; i < loop; i++) {
            for (int j = 0; j < loop2; j++) {
            }
            for (int j = 0; j < loop2; j++) {
            }
        }
        System.out.println("Empty:" + (System.currentTimeMillis() - startTime));

        startTime = System.currentTimeMillis();
        for (int i = 0; i < loop; i++) {
            for (int j = 0; j < loop2; j++) {
                local++;
            }
            for (int j = 0; j < loop2; j++) {
                local--;
            }
        }
        System.out.println("Local:" + (System.currentTimeMillis() - startTime));

        startTime = System.currentTimeMillis();
        for (int i = 0; i < loop; i++) {
            for (int j = 0; j < loop2; j++) {
                member++;
            }
            for (int j = 0; j < loop2; j++) {
                member--;
            }
        }
        System.out.println("Member:" + (System.currentTimeMillis() - startTime));

        startTime = System.currentTimeMillis();
        for (int i = 0; i < loop; i++) {
            for (int j = 0; j < loop2; j++) {
                volatileMember++;
            }
            for (int j = 0; j < loop2; j++) {
                volatileMember--;
            }
        }
        System.out.println("VMember:" + (System.currentTimeMillis() - startTime));
    }
}
And here is the result on my X220 (Core i5 CPU), with times in milliseconds:
Round:1 Empty:5 Local:10 Member:312 VMember:33378
Round:2 Empty:31 Local:0 Member:294 VMember:33180
Round:3 Empty:0 Local:0 Member:306 VMember:33085
Round:4 Empty:0 Local:0 Member:300 VMember:33066
Round:5 Empty:0 Local:0 Member:303 VMember:33078
Round:6 Empty:0 Local:0 Member:299 VMember:33398
Round:7 Empty:0 Local:0 Member:305 VMember:33139
Round:8 Empty:0 Local:0 Member:307 VMember:33490
Round:9 Empty:0 Local:0 Member:350 VMember:35291
Round:10 Empty:0 Local:0 Member:332 VMember:33838
It surprised me that access to a volatile member is about 100 times slower than access to a normal member. I know volatile members have some special properties: a modification becomes visible to all threads immediately, and each access to a volatile variable acts as a "memory barrier". But can these side effects really be the main cause of a 100x slowdown?
PS: I also ran the test on a Core 2 CPU machine. There the result is about 9:50, so roughly 5 times slower. It seems this also depends on the CPU architecture. 5 times is still a lot, right?
Access to volatile prevents some JIT optimisations. This is especially important if you have a loop which doesn't really do anything, as the JIT can optimise such loops away (unless you have a volatile field). If you run the loops for longer, the discrepancy should increase further.
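To illustrate the point about loops that don't really do anything, here is a minimal sketch of the kind of rewrite a JIT is permitted to make for a plain field. The class and method names (LoopCollapseSketch, plainLoop, plainCollapsed, volatileLoop) are made up for this example, and whether a particular JVM actually performs this exact transformation depends on its version and settings.

```java
// Hypothetical illustration, not taken from any JVM's source.
class LoopCollapseSketch {
    int member = -100;
    volatile int volatileMember = -100;

    // What the benchmark writes for a plain field: n separate increments.
    void plainLoop(int n) {
        for (int j = 0; j < n; j++) {
            member++;
        }
    }

    // What an optimising JIT may legally reduce plainLoop to, because no
    // other thread is guaranteed to observe the intermediate values.
    void plainCollapsed(int n) {
        if (n > 0) {
            member += n;
        }
    }

    // For a volatile field every read and write must actually happen, in
    // order, so the loop cannot be folded into a single addition.
    void volatileLoop(int n) {
        for (int j = 0; j < n; j++) {
            volatileMember++;
        }
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        LoopCollapseSketch a = new LoopCollapseSketch();
        LoopCollapseSketch b = new LoopCollapseSketch();
        LoopCollapseSketch c = new LoopCollapseSketch();
        a.plainLoop(n);
        b.plainCollapsed(n);
        c.volatileLoop(n);
        // All three end with the same value; only the amount of work differs.
        System.out.println("loop:" + a.member);
        System.out.println("collapsed:" + b.member);
        System.out.println("volatile:" + c.volatileMember);
    }
}
```

If the plain loops in the question are reduced in this way, the Member timing measures very little real work, while the VMember timing measures every single increment and decrement.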
In a more realistic test, you might expect volatile to be between 30% and 10x slower for critical code. In most real programs it makes very little difference because the CPU is smart enough to "realise" that only one core is using the volatile field and cache it rather than using main memory.
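As a rough sketch of what a slightly more realistic test could look like, the loop below mixes the field update with other arithmetic and returns a value, so the plain-field loop cannot simply be discarded. The names (VolatileCostSketch, workPlain, workVolatile) and the iteration count are assumptions chosen for illustration, and the timings will vary by JVM and CPU. For serious measurements a harness such as JMH is the usual choice.

```java
// Hypothetical micro-benchmark sketch; names and sizes are illustrative only.
class VolatileCostSketch {
    int plain;
    volatile int vol;

    // Updates a plain field and folds it into a returned sum.
    long workPlain(int n) {
        long sum = 0;
        for (int j = 0; j < n; j++) {
            plain += j & 7;
            sum += plain ^ j;
        }
        return sum;
    }

    // Same arithmetic, but every update goes through the volatile field.
    long workVolatile(int n) {
        long sum = 0;
        for (int j = 0; j < n; j++) {
            vol += j & 7;
            sum += vol ^ j;
        }
        return sum;
    }

    public static void main(String[] args) {
        VolatileCostSketch s = new VolatileCostSketch();
        int n = 5_000_000;
        long sink = 0; // printed at the end so the results count as used
        // Several rounds, so later rounds run JIT-compiled code.
        for (int round = 1; round <= 5; round++) {
            long t0 = System.nanoTime();
            sink += s.workPlain(n);
            long t1 = System.nanoTime();
            sink += s.workVolatile(n);
            long t2 = System.nanoTime();
            System.out.println("Round:" + round
                    + " Plain(us):" + (t1 - t0) / 1_000
                    + " Volatile(us):" + (t2 - t1) / 1_000);
        }
        System.out.println("sink:" + sink);
    }
}
```

The first round or two mostly measure warm-up, so the later rounds are the ones to compare.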