cuda gpu nvidia nvcc

NVCC -arch -code


I'm confused by the NVCC documentation, section 3.2.7, "Options for Steering GPU Code Generation".

What's the difference between

nvcc -arch=compute_50 -code=sm_50,compute_50 (equivalent to nvcc -arch=sm_50)

and

nvcc -arch=compute_50 -code=sm_50


Solution

  • This:

    nvcc -arch=compute_50 -code=sm_50,compute_50 (equivalent to nvcc -arch=sm_50)
    

    embeds both PTX and SASS into your fatbinary. Including PTX makes it more likely that your code will run on future architectures (higher than cc 5.0), because the driver can JIT-compile the PTX for a GPU that has no matching SASS in the fatbinary.

    This:

    nvcc -arch=compute_50 -code=sm_50
    

    embeds only SASS. The code will run only on an architecture that is binary compatible with cc 5.0; with no PTX in the fatbinary, there is nothing for the driver to JIT-compile for a newer GPU.

    More info is here and here.