c++ memory-leaks nvidia nvml

Memory leak using Nvidia's NVML


In a project I'm using the NVML library to get information about the GPUs in a system. I use it to query the GPU name and GPU UUID. This happens cyclically, 6 to 8 times per minute. I noticed a small memory leak that crashes my application after a few hours. This is how I use NVML to query the GPU device name:

nvmlReturn_t result = nvmlInit();

nvmlDevice_t device;
result = nvmlDeviceGetHandleByIndex(deviceNum, &device);

char nameBuffer[NVML_DEVICE_NAME_BUFFER_SIZE];
result = nvmlDeviceGetName(device, nameBuffer, NVML_DEVICE_NAME_BUFFER_SIZE);

result = nvmlShutdown();

But even if I reduce the code to just an init and shutdown of NVML, the used memory still increases constantly:

nvmlReturn_t result = nvmlInit();

// nvmlDevice_t device;
// result = nvmlDeviceGetHandleByIndex(deviceNum, &device);

// char nameBuffer[NVML_DEVICE_NAME_BUFFER_SIZE];
// result = nvmlDeviceGetName(device, nameBuffer, NVML_DEVICE_NAME_BUFFER_SIZE);

result = nvmlShutdown();

Am I using the API correctly, or is something wrong? Is there a known issue in the NVML library?

System info:
OS: Windows 10
Nvidia Driver: 536.40
CUDA: 12.2


Solution

  • Here is the simple solution to the problem:

    I tested with the simple C file from Homer512 in the comments (it just initializes and shuts down NVML in a loop). Over time, the test system ran out of memory.

    Then I updated the Nvidia driver to the latest version (556.12). This seems to fix the issue.
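
If updating the driver is not an option, one possible workaround is to avoid the repeated init/shutdown altogether: initialize NVML once, run all polling cycles, and shut down once at exit. The sketch below only illustrates that structure and has not been tested against the leaking driver. `ScopedInit`, `FakeLibrary` and `pollWithSingleInit` are hypothetical names, and the NVML calls are replaced by counters so the example compiles and runs without a GPU or `nvml.h`. In real code the two callbacks would wrap `nvmlInit()` and `nvmlShutdown()`.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// RAII guard: calls `init` once on construction and `shutdown` once on
// destruction. Assumption: with NVML, `init` would wrap nvmlInit() and
// `shutdown` would wrap nvmlShutdown().
class ScopedInit {
public:
    ScopedInit(const std::function<bool()>& init, std::function<void()> shutdown)
        : shutdown_(std::move(shutdown)) {
        if (!init()) {
            throw std::runtime_error("library initialization failed");
        }
    }
    ~ScopedInit() { shutdown_(); }

    // Non-copyable, so shutdown cannot run twice for one init.
    ScopedInit(const ScopedInit&) = delete;
    ScopedInit& operator=(const ScopedInit&) = delete;

private:
    std::function<void()> shutdown_;
};

// Hypothetical stand-in for the real library: it only counts calls.
struct FakeLibrary {
    int initCalls = 0;
    int shutdownCalls = 0;
    int queries = 0;
};

// Runs `cycles` polling cycles inside a single init/shutdown pair.
inline FakeLibrary pollWithSingleInit(int cycles) {
    assert(cycles >= 0);
    FakeLibrary lib;
    {
        ScopedInit guard([&] { ++lib.initCalls; return true; },
                         [&] { ++lib.shutdownCalls; });
        for (int i = 0; i < cycles; ++i) {
            // Real code would call nvmlDeviceGetHandleByIndex() and
            // nvmlDeviceGetName() here and check each return value.
            ++lib.queries;
        }
    }  // guard is destroyed here: shutdown runs exactly once
    return lib;
}
```

With this structure, eight polling cycles cause one init and one shutdown instead of eight of each, so a leak tied to the init/shutdown pair would grow far more slowly.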