I have a function that is heavily parallelized with OpenMP. When launched from a simple console executable, it saturates every core of the machine and the speedup scales linearly with the number of processors.
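void updateStateWithAParallelAlgorithm()
{
    // numParticles is a placeholder for the simulation's particle count
    #pragma omp parallel for
    for (int i = 0; i < numParticles; ++i)
    {
        // do parallel things, update positions of particles in a physics simulation
    }
}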
This function is also used inside a QThread in my Qt program. The constraint is that I have to update the screen positions of the particles after every call to updateStateWithAParallelAlgorithm().
When launched inside my Qt main program, I see no improvement in speed of the algorithm and the 8 cores of my processor are not saturated.
I would expect to see a peak-pause pattern on the CPU usage graph, but this doesn't happen.
Here is some more information.
class MyComputationThread : public QThread
{
    Q_OBJECT
    // some methods
    // some variables
    void doComputation()
    {
        this->setPriority(QThread::HighestPriority);
#ifdef Q_WS_X11
        int s;
        cpu_set_t cpuset;
        CPU_ZERO(&cpuset);
        CPU_SET(1, &cpuset);
        s = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
        if (s != 0) {
            errno = s; // pthread functions return the error code instead of setting errno
            perror("pthread_setaffinity_np");
        }
#endif
        updateStateWithAParallelAlgorithm();
    }
};
I would like to understand how my MyComputationThread class can exploit all the cores, without being constrained to a single CPU as it is by the pthread_setaffinity_np call.
According to the pthread_setaffinity_np(3) manual page:
A new thread created by pthread_create(3) inherits a copy of its creator's CPU affinity mask.
You are restricting that MyComputationThread instance to a single core, and therefore every thread the OpenMP run-time spawns from it inherits the same one-core mask and runs on that core too. You should either remove the call to pthread_setaffinity_np(), or move the call to updateStateWithAParallelAlgorithm() before the part that sets the affinity.
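If you want to see the inheritance for yourself, here is a minimal sketch, assuming Linux with glibc (the *_np calls and CPU_COUNT are GNU extensions, which g++ enables by default). It uses std::thread as a stand-in for the worker threads an OpenMP run-time would create. The function names are made up for this illustration and are not part of Qt, OpenMP, or your code.

```cpp
#include <cassert>
#include <pthread.h>
#include <sched.h>
#include <thread>

// Number of CPUs the calling thread is allowed to run on (-1 on error).
int allowedCpuCount()
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (pthread_getaffinity_np(pthread_self(), sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}

// The same count, measured from inside a freshly created thread.
int allowedCpuCountInNewThread()
{
    int count = -1;
    std::thread worker([&count] { count = allowedCpuCount(); });
    worker.join();
    return count;
}

// Pins the calling thread to the first CPU of its current mask, reports how
// many CPUs a thread created while pinned is allowed to use, then restores
// the original mask. Returns -1 if the affinity calls fail.
int childCpuCountWhilePinned()
{
    cpu_set_t original;
    CPU_ZERO(&original);
    if (pthread_getaffinity_np(pthread_self(), sizeof(original), &original) != 0)
        return -1;

    // Build a mask holding only the first CPU we are currently allowed on.
    cpu_set_t single;
    CPU_ZERO(&single);
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &original)) {
            CPU_SET(cpu, &single);
            break;
        }
    }
    assert(CPU_COUNT(&single) == 1); // sanity check: exactly one CPU selected

    if (pthread_setaffinity_np(pthread_self(), sizeof(single), &single) != 0)
        return -1;

    // The new thread inherits a copy of the creator's one-CPU mask.
    int inherited = allowedCpuCountInNewThread();

    // Undo the pinning so later threads can use every allowed CPU again.
    pthread_setaffinity_np(pthread_self(), sizeof(original), &original);
    return inherited;
}
```

Calling childCpuCountWhilePinned() from any thread should report a single CPU for the child, while allowedCpuCountInNewThread() called without pinning should match allowedCpuCount() of the caller. That is the same mechanism that keeps the OpenMP workers on one core in the question.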