Why does the compiled binary get smaller when -gencode is used?
My GPU's compute capability is 3.0.
Binary size for each set of NVCC options:
Without -gencode: 1,780,520 bytes
-gencode=arch=compute_30,code=sm_30: 1,719,080 bytes (smaller)
-gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_61,code=sm_61: 1,780,800 bytes
The NVIDIA documentation says:
Example:
nvcc x.cu
is equivalent to:
nvcc x.cu --gpu-architecture=compute_30 --gpu-code=sm_30,compute_30
But in your case:
nvcc x.cu -gencode=arch=compute_30,code=sm_30
is equivalent to:
nvcc x.cu --gpu-architecture=compute_30 --gpu-code=sm_30
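which does not embed the PTX code for the virtual architecture (compute_30 here). The default build embeds both the sm_30 binary and the compute_30 PTX, so the missing PTX is the most likely source of the 61,440-byte reduction (1,780,520 - 1,719,080).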
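One way to check this is to build both variants and list what each executable embeds. The following is a minimal sketch, not a definitive recipe: it assumes a toolkit that still supports compute_30, that nvcc and cuobjdump are on the PATH, and that x.cu is in the current directory. The output names x_sass_only and x_sass_ptx are hypothetical.

```shell
#!/bin/sh
# Flags from the question: machine code (SASS) for sm_30 only, no PTX.
SASS_ONLY='-gencode=arch=compute_30,code=sm_30'
# Same target plus the PTX for compute_30, which is what the default build embeds.
SASS_AND_PTX='-gencode=arch=compute_30,code=sm_30 -gencode=arch=compute_30,code=compute_30'

# Only run when the tools and the source file are actually present.
if command -v nvcc >/dev/null 2>&1 && command -v cuobjdump >/dev/null 2>&1 && [ -f x.cu ]; then
    # The variables are left unquoted on purpose so each -gencode becomes its own argument.
    nvcc x.cu $SASS_ONLY -o x_sass_only      # hypothetical output name
    nvcc x.cu $SASS_AND_PTX -o x_sass_ptx    # hypothetical output name

    for exe in x_sass_only x_sass_ptx; do
        echo "== $exe =="
        cuobjdump --list-elf "$exe"   # embedded device binaries (SASS)
        cuobjdump --list-ptx "$exe"   # embedded PTX, expected only in x_sass_ptx
        wc -c < "$exe"                # file size in bytes
    done
fi
```

If the explanation above holds, x_sass_ptx should list a PTX entry for compute_30 and come out larger than x_sass_only. Keeping the PTX also lets the driver JIT-compile the kernels for newer GPUs, which is the usual reason to include it.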