cuda fortran mpi openacc nvprof

What would cause nvprof to return no data?


I have a Fortran MPI code instrumented with OpenACC. It is a big code. No way I can provide any meaningful snippets here. It runs fine under Cray aprun:

aprun -n 15 ./mycode

I want to profile it with nvprof. I try:

aprun -n 15 -b nvprof ./mycode

The code again runs OK, but when all is said and done, I get no profiling data, just a message:

======== Warning: No CUDA application was profiled, exiting

There is no other error message provided. Anyone have any idea what would cause this behavior? I am compiling with the Cray MPI Fortran compiler. My compile flags are

-Mdaz -traceback -Ktrap=inv -acc -ta=tesla,cuda6.5,cc35,nofma -Minfo=accel -Mcuda=cuda6.5,cc35 -I. -module .

The cudatoolkit module is loaded.


Solution

  • aprun -n 15 -b nvprof --profile-child-processes ./mycode
    

    On Cray systems, you run aprun from a login node, and aprun launches the processes on the compute nodes. By default, nvprof does not profile child processes, so the --profile-child-processes option is needed for it to profile the spawned processes.
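    With 15 ranks it is usually convenient to have each process write its own profile file. The following is a minimal sketch, not part of the original answer: it assumes nvprof's -o option with the %p placeholder (replaced by the process ID), and the names NRANKS, APP and the output file pattern mycode.%p.nvprof are purely illustrative. The script only prints the command; uncomment the last line to run it on a system where aprun and nvprof are available.

```shell
#!/bin/sh
# Illustrative values; adjust to your job (these names are assumptions).
NRANKS=15
APP=./mycode

# %p is expanded by nvprof to the ID of the process being profiled,
# so each MPI rank ends up with a separate output file.
CMD="aprun -n $NRANKS -b nvprof --profile-child-processes -o mycode.%p.nvprof $APP"

# Dry run: show the command that would be launched.
echo "$CMD"

# Uncomment to actually launch the profiled job:
# $CMD
```

    The resulting files can then be inspected one rank at a time, for example by importing them into the NVIDIA Visual Profiler.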