Tags: tensorflow, cuda, multi-gpu

nvidia-smi reports identical memory usage for both GPUs


Thank you for your time.

Is it unexpected, or even pathological, for nvidia-smi to report the same memory usage for both GPUs? Specifically, I have a 2-GPU system, and the numerator of the "Memory-Usage" figure listed for both GPUs is identical regardless of the circumstances. I should note that other fields, such as "Temp", are reported as different for each GPU.
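As a diagnostic, per-GPU memory usage can also be queried explicitly with nvidia-smi's query mode (these are standard nvidia-smi flags; on a healthy system each GPU index should report its own independent figure):

```shell
# Print one CSV row per GPU: index, name, used memory, total memory.
# If both rows always show identical memory.used values while workloads
# differ, that points at a driver/reporting problem rather than TensorFlow.
nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv
```

Running this under different workloads makes it easy to see whether the identical numbers are real or a reporting artifact. (Requires an NVIDIA driver and GPU, so it cannot be run here.)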

Context: I am trying to debug a problem that arose when limiting the GPUs visible to a TensorFlow program (for example via CUDA_VISIBLE_DEVICES). One hypothesis is that the NVIDIA driver somehow does not distinguish between the GPUs. All NVIDIA/CUDA drivers appear to be installed correctly, since GPU-accelerated programs run fine when all GPUs are simply used.
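For reference, a minimal sketch of the restriction being attempted: CUDA_VISIBLE_DEVICES must be set before TensorFlow (or any CUDA library) initializes, or it has no effect. The TensorFlow lines are commented out here as illustration only:

```python
import os

# Restrict this process to GPU 0 only. This must happen before the first
# CUDA initialization in the process; setting it afterwards is ignored.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Illustration: importing TensorFlow after this point should expose only
# the single visible device.
# import tensorflow as tf
# print(tf.config.list_physical_devices("GPU"))  # expect one GPU entry

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Setting the variable in the shell (`CUDA_VISIBLE_DEVICES=0 python train.py`) is equivalent and avoids any ordering concerns inside the script.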

Specs: 2 Titan X (Pascal) GPUs, Z10PE-D16 motherboard.


Solution

  • The problem seemed to be resolved by upgrading the NVIDIA drivers.