
SLURM C++ sees more cores available than assigned


I am trying to run a single-process, multithreaded job on a SLURM-managed HPC cluster. I intend to spread my threads across multiple cores.

When I request resources, I use the following directives in my batch script:

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

This should allocate 8 CPUs to one process on the same machine, right?

However, when I try to detect the number of cores available with the following code:

#include <iostream>
#include <thread>

int main() {
    unsigned int n = std::thread::hardware_concurrency();
    std::cout << n << " concurrent threads are supported.\n";
}

It outputs:

32 concurrent threads are supported.

This is weird, as I expected it to output "8 concurrent threads are supported." I suspect that, although SLURM allocated only 8 CPUs to the task, the machine has 32 CPUs in total.

However, some packages I use rely on hardware_concurrency() to obtain the number of CPUs. This might cause those packages to overload the system with too many threads.

  1. Any idea why?
  2. Do you think my account will be charged for 32 CPUs for the job, instead of 8?
  3. Should I limit the number of threads in my application to the number of cores I allocated (8), instead of the number of cores detected by C++ (32), to achieve maximum efficiency?
  4. Do you know of any C++ code that reports the number of CPUs allocated by SLURM, rather than the total number of CPUs in the machine?

Solution

  • Even if a package relies on hardware_concurrency(), it usually does so only to pick a default value for the number of threads. Most likely it also provides a way for you to set the desired value yourself. If that is the case, you can get the number of CPUs allocated to your job from SLURM through environment variables. In your particular case, the environment variable is SLURM_CPUS_PER_TASK.

    You can use std::getenv to read the value of an environment variable. It returns a char * (a null pointer if the variable is not set), which you can convert to an int with something such as std::atoi.

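    #include <cstdlib>
    #include <iostream>
    #include <thread>

    int main() {
        unsigned int n = std::thread::hardware_concurrency();
        std::cout << n << " concurrent threads are supported.\n";

        // std::getenv returns a null pointer if the variable is not set
        // (for example, when running outside a SLURM job), so check first.
        const char* cpus = std::getenv("SLURM_CPUS_PER_TASK");
        if (cpus != nullptr) {
            std::cout << "CPUS_PER_TASK: " << std::atoi(cpus) << std::endl;
        } else {
            std::cout << "SLURM_CPUS_PER_TASK is not set.\n";
        }
    }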
    

    If you do not do that, the C++ program will create 32 threads, but SLURM should still confine your job to 8 cores. The 32 threads would then share those 8 cores, so each thread would get only about 25% of a CPU.