python, gpu, tensorflow

How to get current available GPUs in tensorflow?


I plan to use distributed TensorFlow, and I saw that TensorFlow can use GPUs for training and testing. In a cluster environment, each machine could have zero, one, or more GPUs, and I want to run my TensorFlow graph on GPUs on as many machines as possible.

I found that when running tf.Session(), TensorFlow gives information about the GPUs in log messages like the ones below:

I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)

My question is: how do I get information about the currently available GPUs from TensorFlow? I can get the loaded GPU information from the log, but I want to do it in a more sophisticated, programmatic way. I could also restrict the visible GPUs intentionally with the CUDA_VISIBLE_DEVICES environment variable, so I don't want a way of getting GPU information from the OS kernel.

In short, I want a function like tf.get_available_gpus() that will return ['/gpu:0', '/gpu:1'] if there are two GPUs available on the machine. How can I implement this?


Solution

  • Since TensorFlow 2.1, you can use tf.config.list_physical_devices('GPU'):

    import tensorflow as tf
    
    gpus = tf.config.list_physical_devices('GPU')
    for gpu in gpus:
        print("Name:", gpu.name, "  Type:", gpu.device_type)
    

    If you have two GPUs installed, it outputs this:

    Name: /physical_device:GPU:0   Type: GPU
    Name: /physical_device:GPU:1   Type: GPU
    

In TF 2.0, the function still lives under the experimental namespace, so you must call:

    gpus = tf.config.experimental.list_physical_devices('GPU')
    

    See: