I've written a little test program that spawns a large number of threads (in my case 32 threads on a computer with 4 cores) and pins them all to one core with the pthread_setaffinity_np call.
These threads run in a loop in which they report the result of the sched_getcpu call via stdout and then sleep for a short time. What I wanted to see is how strictly the OS adheres to a user's thread-pinning settings (even if they don't make sense, as in my case).
All threads report that they are running on the core I pinned them to, which is what I expected.
However, I've noticed that while the program is running, CPU utilization on all 4 cores is around 100% (normally it's between 0% and 25%). Could someone enlighten me as to why this is the case? I would have expected the utilization on the pinned core to be maximal, with the other cores being only a little higher than usual to compensate.
My code is appended below; it's pretty straightforward. I did the test on a fairly old PC running Ubuntu 18.04.
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define THREADS  32
#define PINNED   3
#define MINUTE   60
#define MILLISEC 1000

void *thread_main(void *arg);

int main(int argc, char **argv) {
    int i;
    pthread_t pthreads[THREADS];

    printf("%d threads will be pinned to cpu %d\n", THREADS, PINNED);
    for (i = 0; i < THREADS; ++i) {
        /* pass the thread index as the start-routine argument */
        pthread_create(&pthreads[i], NULL, &thread_main, (void *)(intptr_t)i);
    }
    sleep(MINUTE);
    return 0;
}

void *thread_main(void *arg) {
    int id = (int)(intptr_t)arg;
    printf("thread %d: initially running on cpu %d\n", id, sched_getcpu());

    /* pin the calling thread to the single core PINNED */
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    CPU_SET(PINNED, &cpu_set);
    assert(0 == pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpu_set));

    while (1) {
        printf("thread %d: running on cpu %d\n", id, sched_getcpu());
        //usleep(MILLISEC);
    }
    return NULL;
}
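As a side note, the pinning can also be verified from inside each thread rather than inferred from sched_getcpu alone. Below is a minimal sketch (not part of the program above; check_affinity is a hypothetical helper, and PINNED mirrors the constant used earlier) that reads the mask back with pthread_getaffinity_np:

/* Sketch: read back the affinity mask of the calling thread and report
 * whether the expected core is allowed. PINNED mirrors the constant
 * in the program above. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define PINNED 3

static void check_affinity(int id) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &mask) != 0) {
        fprintf(stderr, "thread %d: pthread_getaffinity_np failed\n", id);
        return;
    }
    /* CPU_COUNT gives the number of allowed CPUs; CPU_ISSET checks a specific one */
    printf("thread %d: %d cpu(s) allowed, cpu %d %s in the mask\n",
           id, CPU_COUNT(&mask), PINNED,
           CPU_ISSET(PINNED, &mask) ? "is" : "is not");
}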
Even when I close all background activity, utilization is not quite 100%, but it still affects all 4 cores to a significant degree.
@caf:

> If you're running these in a pseudo-terminal, then another process is receiving all of that printf output and processing it, which requires CPU time as well. That process (your terminal, likely also Xorg) is going to show up heavily in profiles. Consider that graphically rendering that text output is going to be far more CPU-intensive than the printf() that generates it. Try running your test process with output redirected to /dev/null.
This is the correct answer, thanks.
With the output redirected to /dev/null, the CPU usage spikes are restricted to the core that all the threads are pinned to.
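For reference, the redirection can also be done from inside the program instead of on the shell command line. This is a minimal sketch (assuming a standard glibc environment, not part of the original test program) that reopens stdout on /dev/null, so the terminal and Xorg never have to render the output:

#include <stdio.h>

/* Sketch: discard all stdout output from inside the program itself,
 * equivalent to running it with "> /dev/null" on the command line. */
int main(void) {
    if (freopen("/dev/null", "w", stdout) == NULL) {
        perror("freopen");
        return 1;
    }
    printf("this never reaches the terminal\n"); /* consumed by /dev/null */
    return 0;
}

Either way, the terminal emulator and X server no longer process the output, so the extra load disappears from the other cores.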