I have a script that looks like this:
for x in ...; do
  for y in ...; do
    # run several commands that depend on x and y and require a single GPU
    # (I also need to specify which GPU to use)
    command1 $x $y GPU0
    command2 $x $y GPU0
  done
done
# Some stuff after the loop
I have 4 GPUs. I want to make the loop parallel. I.e., for the current (x,y) iteration, I want to wait until some GPU is available, run the commands, and go to the next iteration (without waiting for the current iteration to finish). How do I do this?
I know about the flock command, so I can create a lock file for each GPU and use it to control access to the GPU. But, as I understand it, that requires me to know in advance which GPU the current (x,y) iteration plans to use.
Another concern is how to guarantee that every iteration uses the correct x and y. I.e., when we go to the next iteration, x and y change, and that change must not be reflected in command1 $x $y GPU... at the previous iteration.
If you have GNU Parallel:
parallel -j4 'echo Do {1} and {2} on GPU {%}; sleep 1.$RANDOM;' ::: a b c ::: X Y Z
To follow the progress, use --lb:
parallel -j4 --lb 'echo Do {1} and {2} on GPU {%}; sleep 1.$RANDOM; echo GPU {%} done' ::: a b c ::: X Y Z
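Applied to the loop in the question, the job slot number {%} can stand in for the GPU. This is only a sketch: it assumes the GPUs are numbered 0 to 3, so the slot (1 to 4 with -j4) is shifted down by one, and it uses echo in place of the real command1 and command2. The values a b c and X Y Z are placeholders for the real x and y lists.

```shell
# Sketch: turn the job slot {%} (1..4 with -j4) into a 0-based GPU index.
# "echo command1 ..." is a placeholder for the real commands in the question.
job='gpu=$(( {%} - 1 )); echo command1 {1} {2} GPU$gpu; echo command2 {1} {2} GPU$gpu'

# Skip quietly when GNU Parallel is not installed.
if command -v parallel >/dev/null 2>&1; then
  parallel -j4 "$job" ::: a b c ::: X Y Z
fi
```

Because {1} and {2} are substituted when each job is created, every job keeps its own x and y no matter how far the input list has advanced. A slot is reused only after its job has finished, so two jobs should not land on the same GPU at the same time.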
To use a variable in the command, you need to be aware of quoting and exporting the variable:
a='a b ** $ < >'
export a
parallel 'echo "$a"' ::: test
For details see: https://www.gnu.org/software/parallel/man.html#quoting and https://www.gnu.org/software/parallel/man.html#example-using-shell-variables
If the content of the variable does not change when evaled, you can get away with not quoting:
a=ab
parallel echo $a ::: test
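If you do not have GNU Parallel, a similar effect can be sketched in plain shell. This is a different technique from the one above: a FIFO holds one token per GPU, each iteration blocks until it can read a token, and the background job writes the token back when it finishes. The function run_on_gpu, the values a b c and X Y Z, and the GPU numbers 0 to 3 are placeholders, not part of the question. Opening a FIFO read/write with exec 3<> works on Linux but is not guaranteed everywhere.

```shell
# Sketch: a FIFO used as a pool of GPU tokens (no GNU Parallel needed).
# run_on_gpu is a hypothetical stand-in for: command1 $x $y GPU; command2 $x $y GPU
run_on_gpu() {
  echo "x=$1 y=$2 gpu=$3"
  sleep 0.1
}

tmpdir=$(mktemp -d)
mkfifo "$tmpdir/pool"
exec 3<>"$tmpdir/pool"      # open read/write so the open itself never blocks
log="$tmpdir/log"
: >"$log"

# One token per GPU (assumed to be numbered 0..3).
for gpu in 0 1 2 3; do echo "$gpu" >&3; done

for x in a b c; do
  for y in X Y Z; do
    read -r gpu <&3         # blocks until some GPU token is free
    (
      # The subshell is forked with its own copies of x, y and gpu,
      # so later loop iterations cannot change them.
      run_on_gpu "$x" "$y" "$gpu"
      echo "$gpu" >&3       # hand the GPU back to the pool
    ) >>"$log" &
  done
done

wait                        # let the remaining background jobs finish
exec 3>&-
cat "$log"
```

The loop never has to decide up front which GPU an iteration will use: it simply takes whichever token comes out of the FIFO next. Stuff after the loop goes after wait.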