
My CUDA nvprof 'API Trace' and 'GPU Trace' are not synchronized - what to do?


I'm using the CUDA 7.0 profiler, nvprof, to profile some process making CUDA calls:

$ nvprof -o out.nvprof /path/to/my/app

Later, I generate two traces: the 'API trace' (what happens on the host CPU, e.g. CUDA runtime calls and ranges you mark) and the 'GPU trace' (kernel executions, memsets, host-to-device and device-to-host copies, and so on):

$ nvprof -i out.nvprof --print-api-trace --csv 2>&1 | tail -n +2 > api-trace.csv
$ nvprof -i out.nvprof --print-gpu-trace --csv 2>&1 | tail -n +2 > gpu-trace.csv

Every record in each of the traces has a timestamp (or a start and end time). The problem is that time 0 does not refer to the same instant in the two traces: the GPU trace's time-0 point seems to be the moment when the first GPU operation triggered by the profiled process begins to execute, while the API trace's time-0 point seems to be the beginning of process execution, or sometime thereabouts.

I've also noticed that when I import out.nvprof into nvvp, the values are corrected; that is to say, the start time of the first GPU op is not 0, but something more realistic.

How do I obtain the correct offset between the two traces?


Solution

  • It may not be obvious from the nvprof documentation, but it is possible to specify both --print-gpu-trace and --print-api-trace when requesting output from nvprof, whether you are profiling an app or extracting information from a previously captured profiler output file.

    If you are profiling an app, the following should generate a "harmonized" timeline for both API activity and GPU activity:

    nvprof --print-gpu-trace --print-api-trace ./my_app
    

    You can save the output using the --log-file option.

    Similarly, if you are extracting output from a previously captured output file (not the same thing as a log file), you can do something like the following:

    nvprof -i profiler_out_file --print-gpu-trace --print-api-trace ...
    

    where profiler_out_file should be the name of the file you previously saved using the nvprof -o ... option.

    Printing both traces with the same command is essential here: only then do the two (combined) timelines begin at the same point in time. If you issue two commands, each printing a different trace, the timelines may not be 'harmonized' in this way.
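
As a rough sketch of the extraction step described above, the two commands from the question could be folded into a single invocation along these lines. The helper name dump_combined_trace and the file names in the usage comment are hypothetical, and the tail -n +2 step simply mirrors the question's own pipeline; the exact layout of the combined CSV is not specified here, so inspect the output before parsing it.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvprof): print the GPU trace and the API
# trace from one capture file in a single nvprof invocation, so that both
# share the same time origin.
#   $1 = capture file previously written with `nvprof -o ...`
#   $2 = destination file for the combined trace (name is an assumption)
dump_combined_trace() {
    capture_file="$1"
    output_file="$2"

    # Bail out early if nvprof is not installed or not on PATH.
    if ! command -v nvprof >/dev/null 2>&1; then
        echo "nvprof not found in PATH" >&2
        return 127
    fi

    # Same redirection as in the question (the trace is read from stderr and
    # the first line is dropped), but with both --print-* flags at once.
    nvprof -i "$capture_file" --print-gpu-trace --print-api-trace --csv 2>&1 \
        | tail -n +2 > "$output_file"
}

# Example usage (file names are placeholders):
#   dump_combined_trace out.nvprof combined-trace.csv
```

Keeping both flags in one command is the point of the sketch; splitting the call back into two invocations would reintroduce the mismatched time origins.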