
How to set Torch to use only one GPU when there are two GPUs?


My computer has two GPUs, and this is my first time using more than one. When I had a single GPU, I just ran the CUDA program and it ran on that GPU. Now I don't know how to choose which GPU a program uses, or how to restrict a program to a single GPU. I searched the Internet and found a post that says to use:

export CUDA_VISIBLE_DEVICES=0

This must be set before running the program. I have two programs to run: one is a Torch script and the other is a CUDA program. I opened two terminals. In the first terminal, I ran the command above and then started the Torch program. In the second terminal, I ran the same command with the number changed from 0 to 1 and then started the CUDA program.

[Screenshot: result of nvidia-smi]

However, the nvidia-smi output shows that both programs are assigned to GPU 0. I wanted the Torch program (PID 19520) on GPU 0 and the CUDA program (PID 20351) on GPU 1.

How can I assign the two programs to different GPU devices?

The following are the settings of the Torch script (Ubuntu 14.04, NVIDIA GTX Titan X, CUDA 7.5):

--[[command line arguments]]--
cmd = torch.CmdLine()
cmd:text()
cmd:text('Train a Recurrent Model for Visual Attention')
cmd:text('Example:')
cmd:text('$> th rnn-visual-attention.lua > results.txt')
cmd:text('Options:')
cmd:option('--learningRate', 0.01, 'learning rate at t=0')
cmd:option('--minLR', 0.00001, 'minimum learning rate')
cmd:option('--saturateEpoch', 800, 'epoch at which linear decayed LR will reach minLR')
cmd:option('--momentum', 0.9, 'momentum')
cmd:option('--maxOutNorm', -1, 'max norm each layers output neuron weights')
cmd:option('--cutoffNorm', -1, 'max l2-norm of contatenation of all gradParam tensors')
cmd:option('--batchSize', 20, 'number of examples per batch')
cmd:option('--cuda', true, 'use CUDA')
cmd:option('--useDevice', 1, 'sets the device (GPU) to use')
cmd:option('--maxEpoch', 2000, 'maximum number of epochs to run')
cmd:option('--maxTries', 100, 'maximum number of epochs to try to find a better local minima for early-stopping')
cmd:option('--transfer', 'ReLU', 'activation function')
cmd:option('--uniform', 0.1, 'initialize parameters using uniform distribution between -uniform and uniform. -1 means default initialization')
cmd:option('--xpPath', '', 'path to a previously saved model')
cmd:option('--progress', false, 'print progress bar')
cmd:option('--silent', false, 'dont print anything to stdout')

Solution

  • CUDA_VISIBLE_DEVICES=0 th [torch script]
    CUDA_VISIBLE_DEVICES=1 [CUDA script]
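
The per-command form above can be checked without a GPU. The following is a minimal sketch, assuming a POSIX shell; the `sh -c 'echo ...'` child process is a hypothetical stand-in for the real `th` script and CUDA binary, and the variable names `torch_sees`, `cuda_sees` and `after` are made up for illustration. It shows that a variable written in front of a command goes into the environment of that one command only, so each program gets its own value and nothing leaks into the terminal session.

```shell
#!/bin/sh
# Start from a clean state so the demo does not depend on earlier exports.
unset CUDA_VISIBLE_DEVICES

# Prefix form: the assignment applies to this single child process.
# The child here only echoes what it received (stand-in for `th script.lua`).
torch_sees=$(CUDA_VISIBLE_DEVICES=0 sh -c 'echo "$CUDA_VISIBLE_DEVICES"')

# Same form with a different value (stand-in for the CUDA binary).
cuda_sees=$(CUDA_VISIBLE_DEVICES=1 sh -c 'echo "$CUDA_VISIBLE_DEVICES"')

# The calling shell itself was never modified.
after=${CUDA_VISIBLE_DEVICES-unset}

echo "first program sees:  $torch_sees"
echo "second program sees: $cuda_sees"
echo "shell afterwards:    $after"
```

Note that a process started with a single entry in CUDA_VISIBLE_DEVICES sees exactly one GPU, renumbered from the start: device 0 for the CUDA runtime, and device 1 for cutorch, which counts from 1. So if the script passes `--useDevice` to `cutorch.setDevice`, the default value of 1 should stay as it is for either physical GPU.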