I know how to generate a .ptx file from a .cu and how to generate a .cubin file from a .ptx. But I don't know how to get the final executable.
More specifically, I have a sample.cu file, which is compiled to sample.ptx. I then use nvcc to compile sample.ptx to sample.cubin. However, this .cubin file cannot be executed directly without host code. How can I link the .cubin file with my original .cu file to produce the final executable?
You should be able to load and run PTX code directly through the CUDA driver API with cuModuleLoadDataEx, which JIT-compiles the PTX when the module is loaded. There is an example here, on page 5.