cuda, gpu-atomics, compute-capability

CUDA atomicAdd_block is undefined


According to the CUDA Programming Guide, "Atomic functions are only atomic with respect to other operations performed by threads of a particular set ... Block-wide atomics: atomic for all CUDA threads in the current program executing in the same thread block as the current thread. These are suffixed with _block, e.g., atomicAdd_block"

However, the compiler reports atomicAdd_block as undefined, while the same code compiles fine with atomicAdd. Is there a header I should include or a library I should link against?


Solution

  • atomicAdd() has been supported for a long time, by early versions of CUDA and on older micro-architectures. atomicAdd_system() and atomicAdd_block(), however, were introduced, IIANM, with the Pascal micro-architecture in 2016; the minimum Compute Capability that supports them is 6.0. If you're targeting CC 5.2 or earlier, or if your CUDA version is several years old, they may not be available to you.

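    This is likely your situation: even with the version of CUDA current at the time of writing, nvcc defaults to Compute Capability 5.2 when no target is specified with -gencode or -arch (e.g. if you run nvcc -o out my_file.cu). No extra header or library is needed; compile for CC 6.0 or higher instead, e.g. nvcc -arch=sm_60 -o out my_file.cu, assuming your GPU is Pascal or newer.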
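To illustrate the point above, here is a minimal sketch of a kernel that uses atomicAdd_block when the target architecture supports it and falls back to plain atomicAdd otherwise. The names count_threads_per_block and run_count are hypothetical, chosen only for this example, and the __CUDA_ARCH__ guard is one possible way to keep the file compiling for older targets, not the only one.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread adds 1 to its own block's counter.
// Only threads of the same block touch a given counter, so block-scope
// atomicity is enough here.
__global__ void count_threads_per_block(int *block_counters)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
    // Available from Compute Capability 6.0 (Pascal) onwards.
    atomicAdd_block(&block_counters[blockIdx.x], 1);
#else
    // Fallback for older targets: device-wide atomic, still correct.
    atomicAdd(&block_counters[blockIdx.x], 1);
#endif
}

// Hypothetical host helper: launches the kernel and copies the counters back.
// Returns an empty vector if a CUDA call fails.
std::vector<int> run_count(int num_blocks, int threads_per_block)
{
    std::vector<int> host(num_blocks, 0);
    const size_t bytes = num_blocks * sizeof(int);

    int *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return {};
    if (cudaMemset(dev, 0, bytes) != cudaSuccess) { cudaFree(dev); return {}; }

    count_threads_per_block<<<num_blocks, threads_per_block>>>(dev);

    // cudaMemcpy waits for the kernel and also reports earlier launch errors.
    cudaError_t err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return {};
    return host;
}

int main()
{
    std::vector<int> counts = run_count(4, 128);
    for (size_t i = 0; i < counts.size(); ++i)
        printf("block %zu: %d\n", i, counts[i]);
    return counts.empty() ? 1 : 0;
}
```

Built with something like nvcc -arch=sm_60 -o out my_file.cu, the atomicAdd_block branch is the one compiled for the device; built for a target below CC 6.0, the fallback is used instead of producing an "undefined" error. In both cases each counter should end up equal to the number of threads per block.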