cuda, profiling, opencl, nvidia, gpu-shared-memory

Size of statically allocated shared memory per block with Compute Prof (CUDA/OpenCL)


In NVIDIA's Compute Prof there is a column called "static private mem per work group", and its tooltip says "Size of statically allocated shared memory per block". My application shows 64 (bytes, I assume) per block. Does that mean I am actually using somewhere between 1 and 64 of those bytes, or is the profiler only telling me that this amount of shared memory was allocated, with no indication of whether it was used at all?


Solution

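  • If it's allocated, it's probably because you used it. AFAIK CUDA passes parameters to kernels via shared memory (this is the case on compute capability 1.x devices; 2.0 and later pass them via constant memory instead), so it must be that.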
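
One way to cross-check the profiler's number is to ask the CUDA runtime how much static shared memory the compiler reserved for a kernel. The sketch below is only an illustration: the kernel `scale` and the helper `staticSharedBytes` are hypothetical names, not taken from the question, and the kernel is assumed to be launched with 16 threads per block. It relies on `cudaFuncGetAttributes`, whose `sharedSizeBytes` field reports the statically allocated shared memory per block. The figure is fixed at compile time, so it describes what was reserved rather than how many bytes were touched at run time.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (assumed launch: 16 threads per block).
// It declares 16 floats = 64 bytes of static shared memory.
__global__ void scale(float *data, float factor)
{
    __shared__ float tile[16];
    int i = threadIdx.x;
    tile[i] = data[i] * factor;
    __syncthreads();
    // Read a value written by another thread so the shared array is really needed.
    data[i] = tile[15 - i];
}

// Hypothetical helper: returns the static shared memory per block (in bytes)
// that the compiler reserved for `scale`, or -1 if the query fails
// (for example, when no CUDA device is available).
long staticSharedBytes()
{
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, scale);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes failed: %s\n",
                cudaGetErrorString(err));
        return -1;
    }
    return (long)attr.sharedSizeBytes;
}
```

Calling `staticSharedBytes()` from the host and printing the result gives a number to compare with the profiler column. On hardware that passes kernel arguments through shared memory, the reported value may be larger than the arrays declared in the kernel. Compiling with `nvcc --ptxas-options=-v` prints a similar per-kernel shared memory figure at build time.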