I've written a little test program that spawns a large number of threads (in my case 32 threads on a computer with 4 cores) and pins them all to one core with the pthread_setaffinity_np call.
These threads run in a loop in which they report the result of the sched_getcpu call via stdout and then sleep for a short time. What I wanted to see is how strictly the OS adheres to a user's thread-pinning settings (even if they don't make sense, as in my case).
All threads report that they are running on the core I pinned them to, which is what I expected.
However, I've noticed that while the program is running, CPU utilization on all 4 cores is around 100% (normally it's between 0% and 25%). Could someone enlighten me as to why this is the case? I would have expected the utilization on the pinned core to be maximal, with the other cores being only a little higher than usual to compensate.
My code is appended below; it's pretty straightforward. I did the test on a fairly old PC running Ubuntu 18.04.
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define THREADS  32
#define PINNED   3
#define MINUTE   60
#define MILLISEC 1000

void *thread_main(void *arg);

int main(int argc, char **argv) {
    int i;
    pthread_t pthreads[THREADS];

    printf("%d threads will be pinned to cpu %d\n", THREADS, PINNED);
    for (i = 0; i < THREADS; ++i) {
        /* pass the thread index as the start-routine argument */
        pthread_create(&pthreads[i], NULL, &thread_main, (void *)(intptr_t)i);
    }
    sleep(MINUTE);
    return 0;
}

void *thread_main(void *arg) {
    int id = (int)(intptr_t)arg;
    printf("thread %d: initially running on cpu %d\n", id, sched_getcpu());

    /* pin the calling thread to the single core PINNED */
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    CPU_SET(PINNED, &cpu_set);
    assert(0 == pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpu_set));

    while (1) {
        printf("thread %d: running on cpu %d\n", id, sched_getcpu());
        //usleep(MILLISEC);
    }
    return NULL;
}
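As a side note, the pinning can also be verified from inside each thread rather than inferred from sched_getcpu alone. Below is a minimal sketch (not part of the program above; check_affinity is a hypothetical helper, and PINNED mirrors the constant used earlier) that reads the mask back with pthread_getaffinity_np:

/* Sketch: read back the affinity mask of the calling thread and report
 * whether the expected core is allowed. PINNED mirrors the constant
 * in the program above. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define PINNED 3

static void check_affinity(int id) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &mask) != 0) {
        fprintf(stderr, "thread %d: pthread_getaffinity_np failed\n", id);
        return;
    }
    /* CPU_COUNT gives the number of allowed CPUs; CPU_ISSET checks a specific one */
    printf("thread %d: %d cpu(s) allowed, cpu %d %s in the mask\n",
           id, CPU_COUNT(&mask), PINNED,
           CPU_ISSET(PINNED, &mask) ? "is" : "is not");
}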
Even when I close all background activity, utilization is not quite 100%, but it still affects all 4 cores to a significant degree.
@caf:

> If you're running these in a pseudo-terminal, then another process is receiving all of that printf output and processing it, which requires CPU time as well. That process (your terminal, likely also Xorg) is going to show up heavily in profiles. Consider that graphically rendering that text output is going to be far more CPU-intensive than the printf() that generates it. Try running your test process with output redirected to /dev/null.
This is the correct answer, thanks.
With the output redirected to /dev/null, the CPU usage spikes are restricted to the core that all the threads are pinned to.
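For reference, the redirection can also be done from inside the program instead of on the shell command line. This is a minimal sketch (assuming a standard glibc environment, not part of the original test program) that reopens stdout on /dev/null, so the terminal and Xorg never have to render the output:

#include <stdio.h>

/* Sketch: discard all stdout output from inside the program itself,
 * equivalent to running it with "> /dev/null" on the command line. */
int main(void) {
    if (freopen("/dev/null", "w", stdout) == NULL) {
        perror("freopen");
        return 1;
    }
    printf("this never reaches the terminal\n"); /* consumed by /dev/null */
    return 0;
}

Either way, the terminal emulator and X server no longer process the output, so the extra load disappears from the other cores.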