failed using cuda-gdb to launch program with CUPTI calls

I'm having this weird issue: I have a program that uses CUPTI callbackAPI to monitor the kernels in the program. It runs well when it's directly launched; but when I put it under cuda-gdb and run, it failed with the following error:

error: function cuptiSubscribe(&subscriber, CUpti_CallbackFunc)my_callback, NULL) failed with error CUPTI_ERROR_NOT_INITIALIZED

I've tried all examples in CUPTI/samples and concluded that programs that use callbackAPI and activityAPI will fail under cuda-gdb. (They are all well-behaved without cuda-gdb) But the fail reason differs: If I have calls from activityAPI, then once run it under cuda-gdb, it'll hang for a minute then exit with error:

The CUDA driver has hit an internal error. Error code: 0x100ff00000001c Further execution or debugging is unreliable. Please ensure that your temporary directory is mounted with write and exec permissions.

If I have calls from callbackAPI like my own program, then it'll fail out much sooner with the same error:

CUPTI_ERROR_NOT_INITIALIZED

Any experience on this kinda issue? I really appreciate that!

Solution

According to NVIDIA forum posting here and also referred to here, the CUDA "tools" must be used uniquely. These tools include:

CUPTI
any profiler
cuda-memcheck
a debugger

Only one of these can be "in use" on a code at a time. It should be fairly easy for developers to use a profiler, or cuda-memcheck, or a debugger independently, but a possible takeaway for those using CUPTI, who also wish to be able to use another CUDA "tool" on the same code, would be to provide a coding method to be able to disable CUPTI use in their application, when they wish to use another tool.