According to the documentation for the event/metric summary mode of nvprof, the output looks like:
==6461== Profiling application: matrixMul
==6461== Profiling result:
==6461== Event result:
//The outputs
==6461== Metric result:
//The outputs
The default should show the latencies, percentages, etc. for API calls and kernels under Profiling result. So there are two questions:
1. Why isn't there any output under Profiling Result?
2. How do I get nvprof to output Profiling Result also?

Why isn't there any output under Profiling Result?
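The documentation states: "nvprof operates in one of the modes listed below." Those modes are:
3.1.1 Summary Mode
3.1.2 GPU-Trace and API-Trace Modes
3.1.3 Event/metric Summary Mode
3.1.4 Event/metric Trace Mode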
Your excerpted info is from 3.1.3 Event/metric Summary Mode. When you are in this mode you are not in any of the other modes, and the data collection (and output) description for the other modes does not apply.
How do I get nvprof to output Profiling Result also?
If you want to capture metric info on a per-kernel basis, use 3.1.4 Event/metric Trace Mode. Output will then appear in the Profiling Result section.
For other combinations, it's not possible to get nvprof to display an arbitrary collection of profiling data in a single run. If you require output that is only available in a particular mode, you will need to run in that mode to get that output. You may need to run nvprof multiple times to get all the output or data that you'd like to collect. nvvp (the visual profiler) does this (i.e., it runs nvprof multiple times under the hood) in order to display a greater range of data for a given application view.
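As a rough sketch of the "run it once per mode" approach, the helper below builds one nvprof command line per mode and runs them in sequence. The names nvprof_commands and run_all are hypothetical, and the event local_load and metric ipc are only placeholders; which events and metrics exist depends on your GPU (check with nvprof --query-events and nvprof --query-metrics). The flags are the ones I understand the profiler guide to use for each mode, so verify them against your CUDA version.

```python
import shutil
import subprocess


def nvprof_commands(app, events="local_load", metrics="ipc"):
    """Build one nvprof argv list per profiling mode.

    `events` and `metrics` are placeholder names (assumptions); substitute
    whatever your device actually supports.
    """
    counters = ["--events", events, "--metrics", metrics]
    return {
        # Summary mode (the default): timing summary under "Profiling result".
        "summary": ["nvprof", app],
        # GPU-trace mode: one line per kernel launch or memory copy.
        "gpu_trace": ["nvprof", "--print-gpu-trace", app],
        # Event/metric summary mode: values aggregated over all launches.
        "event_metric_summary": ["nvprof"] + counters + [app],
        # Event/metric trace mode: values reported for each kernel launch.
        "event_metric_trace": ["nvprof"] + counters + ["--print-gpu-trace", app],
    }


def run_all(app):
    """Run the application once under each mode, printing a header per run."""
    if shutil.which("nvprof") is None:
        raise RuntimeError("nvprof was not found on PATH")
    for mode, argv in nvprof_commands(app).items():
        print("--- " + mode + " ---")
        subprocess.run(argv, check=True)
```

Calling run_all("./matrixMul") would then profile the application four times, which is roughly what nvvp does for you. Keep in mind that event and metric collection can slow the application down considerably, so timings are best taken from the summary or trace runs that do not collect counters.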