I'm confused by the NVCC documentation, section 3.2.7, Options for Steering GPU Code Generation.
What's the difference between
nvcc -arch=compute_50 -code=sm_50,compute_50
(equivalent to nvcc -arch=sm_50)
and
nvcc -arch=compute_50 -code=sm_50
This:
nvcc -arch=compute_50 -code=sm_50,compute_50 (equivalent to nvcc -arch=sm_50)
embeds both PTX and SASS in your fatbinary. Including PTX lets the driver JIT-compile the code for architectures newer than cc 5.0, so your code is more likely to run on future GPUs.
This:
nvcc -arch=compute_50 -code=sm_50
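embeds only SASS. The code will run only on an architecture that is binary compatible with cc 5.0, i.e. cc 5.x devices. On anything else there is no PTX for the driver to JIT-compile, so the kernel launch fails.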
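To make the difference concrete, here is a rough sketch of the compatibility rule in Python. It is only a simplified model, not an NVIDIA API: the function names (`sass_runs_on`, `ptx_runs_on`, `can_run`) are made up for illustration, and it assumes the usual rules that SASS runs on the same major version with an equal or higher minor version, while PTX can be JIT-compiled for any equal or higher compute capability. Architecture-specific targets are not covered.

```python
# Simplified model of the rule described above; not an NVIDIA API.
# Compute capabilities are (major, minor) tuples, e.g. (5, 0) for cc 5.0.

def sass_runs_on(sass_cc, device_cc):
    """SASS is binary compatible within one major version, equal or higher minor."""
    return sass_cc[0] == device_cc[0] and device_cc[1] >= sass_cc[1]

def ptx_runs_on(ptx_cc, device_cc):
    """PTX can be JIT-compiled by the driver for an equal or newer capability."""
    return device_cc >= ptx_cc

def can_run(device_cc, sass=(), ptx=()):
    """True if any embedded SASS or PTX image is usable on the device."""
    return (any(sass_runs_on(s, device_cc) for s in sass)
            or any(ptx_runs_on(p, device_cc) for p in ptx))

# nvcc -arch=compute_50 -code=sm_50,compute_50  (SASS and PTX for cc 5.0)
both = dict(sass=[(5, 0)], ptx=[(5, 0)])
# nvcc -arch=compute_50 -code=sm_50             (SASS only)
sass_only = dict(sass=[(5, 0)])

# Columns: device cc, runs with SASS+PTX, runs with SASS only
for device in [(3, 5), (5, 0), (5, 2), (6, 1), (8, 6)]:
    print(device, can_run(device, **both), can_run(device, **sass_only))
```

In this model the two builds behave the same on cc 5.x devices and differ only on newer ones, where the SASS-only build has nothing usable. On real hardware that case shows up as a "no kernel image is available for execution on the device" error. If you want to check what a given binary actually contains, `cuobjdump --list-elf` and `cuobjdump --list-ptx` should list the embedded SASS and PTX images.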