cuda nsight-compute

How do I analyze register spills with Nsight Compute?


I am having trouble finding where the data for local memory usage is reported. Right now, the only approach I know is to look for STL (store to local memory) instructions in the source. I would like to find concrete numbers.


Solution

  • The very short answer is, apparently, that Nsight Compute currently doesn't show local memory spills.

    However:

    • If you are compiling statically with nvcc, you can always see the spills to local memory reported by the assembler (ptxas) by passing -Xptxas="-v", i.e. by turning on verbose output from the assembler.
    • If you are using NVRTC and compiling at runtime, you can obtain the information programmatically via the cuFuncGetAttribute API with the CU_FUNC_ATTRIBUTE_LOCAL_SIZE_BYTES attribute, provided you have a handle to the function.
    • If you are using CuPy, kernel objects have a local_size_bytes attribute which is automagically populated after compilation.
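
To get concrete numbers out of the verbose assembler output mentioned in the first bullet, something like the following might help. It is only a sketch: it assumes the usual "Function properties for ..." / "bytes spill stores" / "Used N registers" line layout that ptxas prints, and the sample log, the kernel name and all the numbers in it are made up for illustration. The helper names (parse_ptxas_log, spilling_functions) are hypothetical, not part of any NVIDIA tool.

```python
import re

# Made-up sample of what `nvcc -Xptxas -v` typically prints; the kernel name
# and every number here are illustrative assumptions, not real measurements.
SAMPLE_LOG = """\
ptxas info    : 0 bytes gmem
ptxas info    : Compiling entry function '_Z6kernelPf' for 'sm_80'
ptxas info    : Function properties for _Z6kernelPf
    40 bytes stack frame, 36 bytes spill stores, 36 bytes spill loads
ptxas info    : Used 255 registers, 360 bytes cmem[0]
"""

FUNC_RE = re.compile(r"Function properties for (\S+)")
SPILL_RE = re.compile(
    r"(\d+) bytes stack frame, (\d+) bytes spill stores, (\d+) bytes spill loads"
)
REGS_RE = re.compile(r"Used (\d+) registers")


def parse_ptxas_log(log):
    """Return {function_name: stats} parsed from ptxas verbose output."""
    stats = {}
    current = None
    for line in log.splitlines():
        m = FUNC_RE.search(line)
        if m:
            # A new "Function properties" line starts a new record.
            current = m.group(1)
            stats[current] = {}
            continue
        if current is None:
            continue  # ignore anything before the first function record
        m = SPILL_RE.search(line)
        if m:
            frame, stores, loads = (int(g) for g in m.groups())
            stats[current].update(
                stack_frame=frame, spill_stores=stores, spill_loads=loads
            )
            continue
        m = REGS_RE.search(line)
        if m:
            stats[current]["registers"] = int(m.group(1))
    return stats


def spilling_functions(stats):
    """Names of functions that report a nonzero number of spilled bytes."""
    return sorted(
        name
        for name, s in stats.items()
        if s.get("spill_stores", 0) > 0 or s.get("spill_loads", 0) > 0
    )
```

In practice you would feed it the captured compiler output instead of SAMPLE_LOG, for example by redirecting the diagnostics of an nvcc invocation to a file and reading that file back (the ptxas info lines normally go to stderr).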

    [answer assembled from comments and added as a community wiki entry]
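
For the CuPy route, a minimal sketch could look like the one below. It assumes cupy.RawKernel, which exposes num_regs and local_size_bytes; the scale kernel and the helper names (local_memory_report, demo_with_cupy) are hypothetical and only there for illustration. Note that local_size_bytes is the per-thread local memory size, which also covers things like local arrays, so a nonzero value on its own does not prove that registers were spilled.

```python
def local_memory_report(kernel):
    """Summarise per-thread resource usage of a compiled kernel object.

    `kernel` only needs to expose `num_regs` and `local_size_bytes`,
    which cupy.RawKernel does.
    """
    local_bytes = kernel.local_size_bytes
    return {
        "registers": kernel.num_regs,
        "local_bytes": local_bytes,
        "uses_local_memory": local_bytes > 0,
    }


def demo_with_cupy():
    """Compile a tiny illustrative kernel and print its report.

    Requires CuPy and a CUDA-capable GPU, so it is not called automatically.
    """
    import cupy as cp  # imported lazily so the helper above works without CuPy

    source = r"""
    extern "C" __global__ void scale(float* x, float a, int n) {
        int i = blockDim.x * blockIdx.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }
    """
    kernel = cp.RawKernel(source, "scale")
    # Reading the attributes is expected to compile the kernel if needed
    # (an assumption based on CuPy compiling raw kernels lazily).
    print(local_memory_report(kernel))
```

Calling demo_with_cupy() on a machine with a GPU should print the register count and local memory size for the toy kernel; for a kernel this small the local size would be expected to be zero, but that is not guaranteed.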