cuda linker-errors static-linking cufft

cuFFT static linking failed


I tried to link cuFFT statically.

nvcc -ccbin g++ -dc -O3 -arch=sm_35 fftStat.cu -o fftStat.o;
nvcc -ccbin g++ -dlink -arch=sm_35 fftStat.o -o link.o;
g++ main.cc link.o fftStat.o -lcudart -lcudadevrt -lcufft_static   -lculibos -ldl -pthread -lrt -L/usr/local/cuda-10.2/lib64 -o run

It gave me the following errors (not all of them are shown):

/usr/local/cuda-10.2/lib64/libcufft_static.a(fft_dimension_class_multi.o): In function `__sti____cudaRegisterAll()':
fft_dimension_class_multi.compute_75.cudafe1.cpp:(.text+0xdad): undefined reference to `__cudaRegisterLinkedBinary_44_fft_dimension_class_multi_compute_75_cpp1_ii_466e44ab'
/usr/local/cuda-10.2/lib64/libcufft_static.a(fft_dimension_class_multi.o): In function `global constructors keyed to BaseListMulti::radices':
fft_dimension_class_multi.compute_75.cudafe1.cpp:(.text+0x1c8d): undefined reference to 
float_64bit_regular_RT_SM50_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM50_plus_compute_75_cpp1_ii_66731515'
/usr/local/cuda-10.2/lib64/libcufft_static.a(float_64bit_regular_RT_SM50_plus.o): In function `global constructors keyed to compile_unitsforce_compile_float_width64_t_regular_fft_kernels__SM50_unbounded()':
float_64bit_regular_RT_SM50_plus.compute_75.cudafe1.cpp:(.text+0x29d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM50_plus_compute_75_cpp1_ii_66731515'
/usr/local/cuda-10.2/lib64/libcufft_static.a(float_64bit_regular_RT_SM60_plus.o): In function `__sti____cudaRegisterAll()':
float_64bit_regular_RT_SM60_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM60_plus_compute_75_cpp1_ii_dbb979db'
/usr/local/cuda-10.2/lib64/libcufft_static.a(float_64bit_regular_RT_SM60_plus.o): In function `global constructors keyed to compile_unitsforce_compile_float_width64_t_regular_fft_kernels__SM60_unbounded()':
float_64bit_regular_RT_SM60_plus.compute_75.cudafe1.cpp:(.text+0x18d): undefined reference to `__cudaRegisterLinkedBinary_51_float_64bit_regular_RT_SM60_plus_compute_75_cpp1_ii_dbb979db'
/usr/local/cuda-10.2/lib64/libcufft_static.a(half_32bit_regular_RT_SM53_plus.o): In function `__sti____cudaRegisterAll()':
half_32bit_regular_RT_SM53_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to `__cudaRegisterLinkedBinary_50_half_32bit_regular_RT_SM53_plus_compute_75_cpp1_ii_96a57339'
/usr/local/cuda-10.2/lib64/libcufft_static.a(half_32bit_regular_RT_SM53_plus.o): In function `global constructors keyed to compile_unitsforce_compile_half_width32_t_regular_fft_kernels__SM53_unbounded()':
half_32bit_regular_RT_SM53_plus.compute_75.cudafe1.cpp:(.text+0x1b0d): undefined reference to `__cudaRegisterLinkedBinary_50_half_32bit_regular_RT_SM53_plus_compute_75_cpp1_ii_96a57339'
/usr/local/cuda-10.2/lib64/libcufft_static.a(half_32bit_vector_RT_SM53_plus.o): In function `__sti____cudaRegisterAll()':
half_32bit_vector_RT_SM53_plus.compute_75.cudafe1.cpp:(.text+0x3d): undefined reference to 
dpRadix0343C_cb.compute_75.cudafe1.cpp:(.text+0xa54): undefined reference to `__cudaRegisterLinkedBinary_34_dpRadix0343C_cb_compute_75_cpp1_ii_b592a056'
collect2: error: ld returned 1 exit status

Dynamic linking works:

g++ main.cc link.o fftStat.o -lcudart -lcudadevrt -lcufft -L/usr/local/cuda-10.2/lib64 -o run

I followed the nvcc guide on separate compilation (https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#code-changes-for-separate-compilation) and the cuFFT guide on the static library (https://docs.nvidia.com/cuda/cufft/index.html#static-library), but apparently something is missing.


Solution

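  • Some of what you are attempting at the final link needs to happen at the device link (your 2nd step). The cuFFT static library contains relocatable device code, so it has to be passed to the -dlink step as well; the undefined __cudaRegisterLinkedBinary_... symbols are the ones the device link generates for it. The following seems to work for me: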

    $ cat fftStat.cu
    #include <cufft.h>
    
    void test(){
    
      cufftHandle h;
      cufftCreate(&h);
    }
    
    $ cat main.cpp
    void test();
    
    int main(){
    
      test();
    }
    
    $ nvcc -ccbin g++ -dc -O3 -arch=sm_35 fftStat.cu -o fftStat.o
    $ nvcc -ccbin g++ -dlink -arch=sm_35 fftStat.o -o link.o -lcufft_static -lcudadevrt
    $ g++ main.cpp link.o fftStat.o -L/usr/local/cuda-10.2/lib64   -lcufft_static -lcudart -lcudadevrt -lculibos -ldl -pthread -lrt  -o run
    

    Note that I've also rearranged the link order to account for link dependencies. This may or may not matter depending on your exact version of g++. Some of the options here (e.g. -lcudadevrt at the device-link step) may or may not be needed depending on your actual code, which you haven't shown. For the code above, -lcudadevrt at the device-link step is not actually necessary.
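
As an alternative sketch, nvcc itself can act as the linker driver: when it links objects that were built with -dc, it performs the device link implicitly before the host link, so the separate -dlink step and link.o go away. The version below assumes the same file names as above. The ARCH and DRY_RUN variables and the run and build_static_cufft helpers are hypothetical names introduced for illustration, and the commands have not been checked against your toolchain.

```shell
#!/bin/sh
# Sketch: compile with relocatable device code, then let nvcc do both the
# device link and the host link in a single invocation.
# Assumptions: file names follow the example above; ARCH and DRY_RUN are
# hypothetical convenience variables, not nvcc features.

ARCH="${ARCH:-sm_35}"

# Print the command, then execute it unless DRY_RUN=1.
run() {
  echo "$@"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

build_static_cufft() {
  # Step 1: compile the CUDA source to an object with relocatable device code.
  run nvcc -ccbin g++ -dc -O3 -arch="$ARCH" fftStat.cu -o fftStat.o || return 1
  # Step 2: nvcc as the link driver performs the device link implicitly,
  # then the host link. The static cuFFT library and culibos are passed here.
  run nvcc -ccbin g++ -arch="$ARCH" fftStat.o main.cpp -lcufft_static -lculibos -o run
}
```

Calling build_static_cufft runs both steps; with DRY_RUN=1 set, it only prints the two nvcc commands so you can inspect them first.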