I am playing around with multi-treading in Java (Sun JDK 1.7 64 bit), trying to grasp some concepts better. What I find puzzling is finding the size of a thread pool for the executor and impact of that setting on performance. Here is my basic code:
public class Program {
static int bestThreads = 0;
static long bestTime = Integer.MAX_VALUE;
public static void main(String[] args) throws InterruptedException, ExecutionException {
int cores = Runtime.getRuntime().availableProcessors();
for (int sizeOfPool = 1; sizeOfPool <= cores; sizeOfPool++) {
ExecutorService exec = Executors.newFixedThreadPool(sizeOfPool);
//System.out.println("Started");
int noOftasks = 1000;
for (int i = 0; i < noOftasks; i++) {
Calculator c = new Calculator();
exec.submit(c);
}
long start = System.currentTimeMillis();
exec.shutdown();
exec.awaitTermination(1000, TimeUnit.DAYS);
long stop = (System.currentTimeMillis() - start);
//System.out.println("Done " + noOftasks + " tasks in " + stop + " on " + sizeOfPool + " threads");
if (bestTime > stop) {
bestTime = stop;
bestThreads = sizeOfPool;
}
}
System.out.println("Best size of pool " + bestThreads + " result in " + bestTime + " ms");
}
public static class Calculator implements Runnable {
@Override
public void run() {
doJob();
}
}
//Can be whatever this just gives me a few milliseconds worth of CPU load since I don't want to use Thread.sleep()
public static void doJob() {
for (int j = 0; j < 1E3; j++) {
Math.round(Math.sin(Math.sqrt(Math.random())));
}
}
When I run this program I get that the setting that used the smallest amount of time was the one using N threads where N is usually 2 (Meaning I should use 2 threads for the size of my thread pool). I don't get why this happens because number of processors I get from .availableProcessors() is 4 (I am using i3 with multi-threading, it's on a laptop and Windows shows that all threads are active when running the program). Also I usually get different results when I change the amount of work done with:
1E1 -> N=4
1E2 -> N=3 or 2
1E3 -> N=2
1E4 -> N=2
but even then in most cases I get N=2;
Can somebody please explain why do I get results like this and what is usually suggested pool size depending on CPU that the program is ran on.
Here is a little bit more output that I find strange:
Done 1000 tasks in 195 on 1 threads //Ok it takes around 200ms for this processor to do this, over-clocking it will help here I imagine
Done 1000 tasks in 134 on 2 threads //I know I can't get 2x increase because of context switching and some other impacts of thread creation overhead but this is a nice speedup
Done 1000 tasks in 138 on 3 threads //Almost same as 2 threads, why is it not worse or better
Done 1000 tasks in 210 on 4 threads //Worse then 1 thread, this is the one I really don't get
Your "test" job is completely CPU bound, that means it depends only on the CPU/core speed. While the i3 claims to have 4 cores, its a dual core CPU (2 cores with 2 threads each - aka hyperthreading).
Hyperthreading does not give you 4 full cores, each core works on either of its two threads (it switches automatically, e.g. when a thread waits for a memory access). So in your test case the i3 CPU performs best with two threads, since that is the maximum your CPU can handle (truely) simultaneously.
With a different test (e.g. with lots of memory accesses, or waiting for I/O) you would get different "ideal" thread numbers.
Edit: There is no way I know of to distinguish between a real "physical" core and a "virtual" core in java. Newer AMD CPU's have their own quirks in that regard (separate cores, but FPU shared between 2 cores), so its really very low-level technology dependent. To really get all the details you probably would need to read the CPU-Id and check the data-sheet for that CPU.
The reason you get sometime 2 and sometimes 3 is probably due to multithreaded tests not being truely deterministic (the operation system will inevitably eat some CPU at random times). Also, short duration test show often a lot of variation in java due to the JIT warming up (look for microbenchmarking, its a complicated topic).
You should see a difference between i3 / i7 regardless.