On attempting to use nvprof to profile my program, I receive the following output with no other information:
<program output>
======== Warning: No profile data collected.
The code follows this classic first CUDA program. nvprof has worked on my system before, but I recently had to reinstall CUDA.
I have tried the suggestions in this post, which recommended adding cudaDeviceReset() and cudaProfilerStart/Stop() and passing extra profiling flags (nvprof --unified-memory-profiling off), without luck.
This NVIDIA developer forum post seems to describe a similar error; however, the suggestions there point to using a compiler other than nvcc because of an OpenACC library that I do not use.
For completeness, I have included my build commands and program code, though I imagine the problem has more to do with my system:
nvcc add.cu -o add_cuda
nvprof ./add_cuda
#include <iostream>
#include <math.h>
#include <cuda_profiler_api.h>

// Kernel function to add the elements of two arrays
__global__
void add(int n, float *x, float *y)
{
  for (int i = 0; i < n; i++)
    y[i] = x[i] + y[i];
}

int main(void)
{
  int N = 1<<20; // 1M elements

  // Allocate Unified Memory -- accessible from CPU or GPU
  float *x, *y;
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // Initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Run kernel on 1M elements on the GPU
  add<<<1, 1>>>(N, x, y);

  // Wait for GPU to finish before accessing on host
  cudaDeviceSynchronize();

  // Check for errors (all values should be 3.0f)
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = fmax(maxError, fabs(y[i]-3.0f));
  std::cout << "Max error: " << maxError << std::endl;

  // Free memory
  cudaFree(x);
  cudaFree(y);

  return 0;
}
How can I resolve this to get actual profiling information using nvprof?
As per the documentation, there is currently no profiling support in CUDA for WSL. This is why there is no profiling data collected when you are using nvprof.
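If you are not sure whether a given shell is running under WSL, one rough way to check is to look at the kernel release string. The sketch below is only a heuristic and not an official API: it assumes WSL kernels report a release containing "microsoft" or "WSL" in /proc/sys/kernel/osrelease, and the helper names looks_like_wsl and read_osrelease are made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <string>

// Heuristic (an assumption, not an official check): WSL kernels usually
// report a release string containing "microsoft" or "WSL",
// for example "5.10.16.3-microsoft-standard-WSL2".
bool looks_like_wsl(std::string osrelease)
{
    // Lowercase the string so the match is case-insensitive.
    std::transform(osrelease.begin(), osrelease.end(), osrelease.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return osrelease.find("microsoft") != std::string::npos ||
           osrelease.find("wsl") != std::string::npos;
}

// Reads the first line of the kernel release file.
// Returns an empty string if the file cannot be opened.
std::string read_osrelease(const std::string& path = "/proc/sys/kernel/osrelease")
{
    std::ifstream in(path);
    std::string line;
    std::getline(in, line);
    return line;
}
```

Calling looks_like_wsl(read_osrelease()) at the start of main would let the program print a note that nvprof is not expected to collect data in that environment, instead of leaving only the "No profile data collected" warning.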