'Flush records' warning in Parallel Nsight profiling results

I'm trying to profile my CUDA kernels on a 32-bit Windows 7 machine with an NVIDIA GTX 480 board. I'm using the 32-bit CUDA 4.1 toolkit and Parallel Nsight 2.1 for VS 2010.

The profiling results of my program repeatedly show the same warning at irregular intervals: Message: Flush records, Event Type: Range, Level: 50

After this event there is always a pause in processing of several milliseconds. Then the GPU resumes computing at the same speed as before.

I haven't found any information about this warning in the CUDA documentation or on the web, and I don't even know whether it is a problem that only occurs during profiling.

Does anyone know what this warning means and how to avoid it?

Solution

The "Flush records" warning shows when the Nsight CUDA Trace Activity is adding overhead to your application, so that you can correctly interpret periods of high CPU activity. There is no way to remove this warning. Your application is not doing anything wrong.

The Nsight CUDA Trace Activity collects timestamps for the start and end of GPU work, including kernel launches, memory copies, and memory sets. When an application launches a task on the GPU, the tool allocates a trace record for the task and programs the GPU to write a timestamp into the record. The timestamps are collected in a way that should not break concurrency and should not stall the CPU. When the work is completed, the tool collects the information and streams it to memory. The Flush range covers the time needed to collect the results and write out the information. This can include time to perform additional kernel launches and to copy memory from device to host. The tool collects results when the application synchronizes a context (cuCtxSynchronize or cuda{Thread, Device}Synchronize) or when it runs out of trace records.
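Since collection happens either at a context synchronization or when the trace records run out, one thing you could try is synchronizing at points you choose yourself, so that the flush is more likely to land at a known place instead of in the middle of a burst of launches. The sketch below only illustrates that idea. The kernel `scaleKernel`, the helper `runBatches`, and the batch sizes are hypothetical placeholders, and I have not verified how many launches fit before the tool runs out of records. It does not remove the flush or its overhead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel; it stands in for the application's real work.
__global__ void scaleKernel(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Launches `batches` groups of `launchesPerBatch` kernels and synchronizes the
// device after each group. According to the answer above, a context
// synchronization is one of the points at which the tool collects its trace
// records, so the flush should tend to occur at these explicit sync points
// (an assumption, not something this code can check).
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t runBatches(int batches, int launchesPerBatch, int n)
{
    float* d_data = NULL;
    cudaError_t err = cudaMalloc(&d_data, n * sizeof(float));
    if (err != cudaSuccess) {
        return err;
    }

    err = cudaMemset(d_data, 0, n * sizeof(float));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    for (int b = 0; b < batches && err == cudaSuccess; ++b) {
        for (int k = 0; k < launchesPerBatch; ++k) {
            scaleKernel<<<blocks, threads>>>(d_data, n, 1.0f);
        }
        // Check for launch errors, then wait for the whole batch to finish.
        err = cudaGetLastError();
        if (err == cudaSuccess) {
            err = cudaDeviceSynchronize();
        }
    }

    cudaFree(d_data);
    return err;
}
```

Calling, for example, `runBatches(4, 8, 1 << 20)` from `main` and comparing the timeline against an unsynchronized run would show whether the Flush ranges move to the sync points in your setup. Keep in mind that extra synchronization has its own cost, so this is mainly useful for making the profile easier to read.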

I will enter a bug to improve the user documentation and tool tips.