I am interested in getting GNU Parallel to run some numerical computation tasks on the GPU. Generically speaking, here is my initial approach:
This brought up the following questions:
Modern CPUs have multiple cores, which means they can run different instructions at the same time; so while core 1 is running a MUL, core 2 may be running an ADD. This is called MIMD - Multiple Instructions, Multiple Data.
GPUs, however, cannot run different instructions at the same time. They excel at running the same instruction on large amounts of data; SIMD - Single Instruction, Multiple Data.
Modern GPUs have multiple cores that are each SIMD.
So where does GNU Parallel fit into this mix?
GNU Parallel starts programs. If your program uses a GPU and you have only a single GPU in your system, GNU Parallel will not make much sense. But if you have, say, 4 GPUs in your system, then it makes sense to keep all 4 busy at the same time. So if your program reads the environment variable CUDA_VISIBLE_DEVICES to decide which GPU to run on, you can do something like this:
seq 10000 | parallel -j4 CUDA_VISIBLE_DEVICES='$(({%} - 1))' compute {}

Here {%} is the job slot number, which runs from 1 to 4 with -j4, so $(({%} - 1)) gives each of the 4 concurrent jobs its own GPU, numbered 0 to 3. The single quotes keep the local shell from evaluating $(( )) before GNU Parallel has substituted {%}.
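The slot-to-device arithmetic can be sanity-checked in plain shell, without GNU Parallel or a GPU installed. This sketch just substitutes the values 1 through 4 that {%} would take with -j4:

```shell
# Simulate GNU Parallel's {%} replacement string for -j4: the job
# slot number is 1-based (1..4), while CUDA device numbers are
# 0-based (0..3), hence the "- 1".
for slot in 1 2 3 4; do
  echo "job slot $slot -> CUDA_VISIBLE_DEVICES=$((slot - 1))"
done
```

Each of the 4 job slots maps to exactly one device, so as long as only one job runs per slot at a time, no two concurrent jobs share a GPU.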