
Passing the PTX program to the CUDA driver directly


The CUDA driver API can load a file containing PTX code from the filesystem. One usually does the following:

CUmodule module;
CUfunction function;
CUresult err;

const char* module_file = "my_prg.ptx";
const char* kernel_name = "vector_add";

err = cuModuleLoad(&module, module_file);
err = cuModuleGetFunction(&function, module, kernel_name);

If the PTX is generated at run time (on the fly), going through file I/O seems wasteful, since the driver has to read back what the program has just written out.

Is there a way to pass the PTX program to the CUDA driver directly (e.g. as a C string)?


Solution

  • Taken from the ptxjit CUDA example:

    Define the PTX program as a C string:

    char myPtx32[] = "\n\
        .version 1.4\n\
        .target sm_10, map_f64_to_f32\n\
        .entry _Z8myKernelPi (\n\
            .param .u32 __cudaparm__Z8myKernelPi_data)\n\
        {\n\
        .reg .u16 %rh<4>;\n\
        .reg .u32 %r<8>;\n\
        // Other stuff\n\
        .loc    28      18      0\n\
        exit;\n\
        }\n\
     ";
    

    then

     cuModuleLoadDataEx(phModule, myPtx32, 0, 0, 0);
    

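    and finally retrieve the kernel handle by its mangled entry name (here phModule is a CUmodule* and phKernel a CUfunction*):

     cuModuleGetFunction(phKernel, *phModule, "_Z8myKernelPi");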