I am developing a C++ application in Qt.
I have a very basic doubt, please forgive me if this is too stupid...
How many threads should I create to divide a task amongst them for minimum time?
I am asking this because my laptop has a 3rd-gen i5 processor (3210M). It is dual core, yet the NO_OF_PROCESSORS environment variable shows 4. I had read in an article that dynamic memory for an application is only available to the processor that launched that application. So should I create only 1 thread (since the env variable says 4 processors), or 2 threads (since my processor is dual core and the env variable might be giving the number of cores), or 4 threads (if that article was wrong)?
Please forgive me since I am a beginner level programmer trying to learn Qt.
Thank You :)
NO_OF_PROCESSORS shows 4 because your CPU has Hyper-Threading. Hyper-Threading is Intel's trademark for technology that lets a single core execute 2 threads more or less at the same time. It works as long as, e.g., one thread is fetching data while the other one is using the ALU. If both need the same resource and the instructions can't be reordered, one thread will stall. This is the reason you see 4 logical processors, even though you have 2 physical cores.
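If you would rather read that number from C++ than from the environment, a minimal sketch could use std::thread::hardware_concurrency() (C++11). The helper name logical_processor_count and its fallback parameter are made up for illustration. Qt offers QThread::idealThreadCount() for the same purpose. The value counts logical processors, so with Hyper-Threading enabled it would be expected to match the 4 you are seeing, and the standard allows it to be 0 when the count cannot be determined.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: number of logical processors as reported by the
// standard library, or `fallback` if the implementation reports 0 (unknown).
unsigned logical_processor_count(unsigned fallback = 1) {
    assert(fallback > 0);  // the fallback itself must be a usable thread count
    const unsigned n = std::thread::hardware_concurrency();
    return n != 0 ? n : fallback;
}
```

Treat the result as a starting point for benchmarking, not as the answer to "how many threads".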
The claim that dynamic memory is only available to one of the cores is IMO not quite right; what is private to a core is its register contents and sometimes its cache contents. Everything that resides in RAM should be available to all CPUs.
More threads than CPUs can help, depending on how your operating system's scheduler works, how you access data, etc. To find out, you'll have to benchmark your code. Everything else will just be guesswork.
Apart from that, if you're trying to learn Qt, this is maybe not the right thing to worry about...
Edit:
Answering your question: We can't really tell you how much slower or faster your program will run if you increase the number of threads. This depends on what you are doing. If you are, e.g., waiting for responses from the network, you could use many more threads than you have cores. If your threads are all competing for the same hardware, 4 threads might not perform better than 1. The best way is to simply benchmark your code.
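A rough sketch of such a benchmark, using only the standard library: it splits a purely CPU-bound job (summing the integers below `total`) into disjoint ranges, one per thread, and measures wall-clock time. The names RunResult and timed_sum and the workload itself are placeholders for illustration; you would substitute your real task.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

struct RunResult {
    double milliseconds;  // wall-clock time for starting, running and joining
    std::uint64_t sum;    // returned so the work is actually used
};

// Sums the integers 0..total-1 with `num_threads` workers and times the run.
RunResult timed_sum(unsigned num_threads, std::uint64_t total) {
    assert(num_threads > 0);
    std::vector<std::uint64_t> partial(num_threads, 0);  // one slot per thread
    std::vector<std::thread> workers;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&partial, t, num_threads, total] {
            // Each worker gets its own disjoint range [begin, end).
            const std::uint64_t begin = total * t / num_threads;
            const std::uint64_t end = total * (t + 1) / num_threads;
            std::uint64_t local = 0;
            for (std::uint64_t i = begin; i < end; ++i) local += i;
            partial[t] = local;  // no other thread writes this slot
        });
    }
    for (auto& w : workers) w.join();
    const auto stop = std::chrono::steady_clock::now();
    std::uint64_t sum = 0;
    for (std::uint64_t p : partial) sum += p;
    return {std::chrono::duration<double, std::milli>(stop - start).count(), sum};
}
```

Calling timed_sum with 1, 2, 4 and 8 threads on the same `total`, several times each, and comparing the milliseconds gives a first impression. With a workload this trivial the compiler may optimize the loop heavily, so numbers from your real task are what count.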
In an ideal world, if you are 'just' crunching numbers, it should not make a difference whether you have 4 or 8 threads running: the net time should be the same (neglecting time for context switches etc.), and only the response time will differ. The thing is that nothing is ideal. We have caches, and your CPUs all access the same memory over the same bus, so in the end they compete for access to resources. Then you also have an operating system that might or might not schedule a thread/process at a given time.
You also asked for an explanation of synchronization overhead: if all your threads access the same data structures, you will have to do some locking etc. so that no thread accesses the data in an invalid state while it is being updated.
Assume you have two threads, both doing the same thing:
    int sum = 0; // global variable, shared by all threads

    void thread() {
        int i = sum; // read the shared value
        i += 1;      // modify the private copy
        sum = i;     // write it back
    }
If you start two threads doing this at the same time, you cannot reliably predict the output. It might happen like this:
    THREAD A : i = sum; // i = 0
               i += 1;  // i = 1
    **context switch**
    THREAD B : i = sum; // i = 0
               i += 1;  // i = 1
               sum = i; // sum = 1
    **context switch**
    THREAD A : sum = i; // sum = 1
In the end sum is 1, not 2, even though you started the thread twice.
To avoid this, you have to synchronize access to sum, the shared data. Normally you would do this by blocking access to sum for as long as needed. Synchronization overhead is the time that threads spend waiting, doing nothing, until the resource is unlocked again.
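One way to do that blocking in standard C++ is a std::mutex held through a std::lock_guard; Qt's counterparts are QMutex and QMutexLocker. The following is only a sketch of the sum example with locking added. The function name synchronized_sum and its parameters are invented for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Starts `num_threads` threads that each add 1 to a shared counter
// `increments_per_thread` times, and returns the final value.
int synchronized_sum(unsigned num_threads, int increments_per_thread) {
    assert(increments_per_thread >= 0);
    int sum = 0;           // the shared data
    std::mutex sum_mutex;  // must be held for every access to sum
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        threads.emplace_back([&sum, &sum_mutex, increments_per_thread] {
            for (int k = 0; k < increments_per_thread; ++k) {
                // Only one thread at a time gets past this line; the others
                // wait here, and that waiting is the synchronization overhead.
                std::lock_guard<std::mutex> lock(sum_mutex);
                int i = sum;
                i += 1;
                sum = i;
            }  // lock is released here
        });
    }
    for (auto& th : threads) th.join();
    return sum;
}
```

With the lock in place, two threads doing one increment each always end with 2. For a plain counter like this, std::atomic<int> would be a lighter alternative to a mutex.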
If you have discrete work packages for each thread and no shared resources, you should have no synchronization overhead.