I have some parallel computing tasks written in Metal. I am wondering if I can run the metal kernel on two or more GPUs at the same time?
Yes.
If, for example, you're on a Mac with both a discrete GPU and an integrated GPU, the array returned by a call to MTLCopyAllDevices() will contain multiple elements. The same is true if you have one or more external GPUs connected to your Mac.
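As a minimal sketch, you can enumerate the available GPUs like this (MTLCopyAllDevices() is macOS-only; on iOS there is always exactly one device):

```swift
import Metal

// List every Metal-capable GPU in the system. On a Mac with a
// discrete + integrated GPU, or with external GPUs attached,
// this array contains more than one element.
let devices = MTLCopyAllDevices()
for device in devices {
    print("GPU: \(device.name), low power: \(device.isLowPower), removable: \(device.isRemovable)")
}
```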
To run the same compute kernel on each GPU, you'll need to create separate resources and pipeline state objects, since each of these objects is affiliated with a single MTLDevice. Everything else about encoding and enqueueing work remains the same.
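Here's a sketch of that per-device setup; the kernel name `myKernel` and the buffer size are placeholders for whatever your own code uses:

```swift
import Metal

// Build a compute pipeline state and a buffer for *each* GPU.
// Both objects are tied to the device that created them, so they
// cannot be shared across devices.
let bufferLength = 1024 * MemoryLayout<Float>.stride

var pipelines: [MTLComputePipelineState] = []
var buffers: [MTLBuffer] = []

for device in MTLCopyAllDevices() {
    guard let library = device.makeDefaultLibrary(),
          let function = library.makeFunction(name: "myKernel"), // placeholder kernel name
          let pipeline = try? device.makeComputePipelineState(function: function),
          let buffer = device.makeBuffer(length: bufferLength, options: .storageModeShared)
    else { continue }
    pipelines.append(pipeline)
    buffers.append(buffer)
}
```

From there, each device gets its own command queue, and you encode and commit compute passes exactly as you would in the single-GPU case.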
Except in limited cases (i.e., when the GPUs occupy the same peer group), you can't copy resources directly between GPUs. You can, however, use an MTLBlitCommandEncoder to copy shared or managed resources via the system bus.
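One way to do this, sketched below under the assumption that both buffers are private and you stage through shared (system-memory) buffers: blit on the source device into a shared buffer, copy across on the CPU, then blit on the destination device:

```swift
import Foundation
import Metal

// Copy the contents of a private buffer on device A into a private
// buffer on device B by staging through shared buffers on each device.
func copyAcrossDevices(src: MTLBuffer, queueA: MTLCommandQueue,
                       dst: MTLBuffer, queueB: MTLCommandQueue) {
    guard let sharedA = queueA.device.makeBuffer(length: src.length, options: .storageModeShared),
          let sharedB = queueB.device.makeBuffer(length: src.length, options: .storageModeShared)
    else { return }

    // 1. Blit on device A: private source -> shared staging buffer.
    let cmdA = queueA.makeCommandBuffer()!
    let blitA = cmdA.makeBlitCommandEncoder()!
    blitA.copy(from: src, sourceOffset: 0, to: sharedA, destinationOffset: 0, size: src.length)
    blitA.endEncoding()
    cmdA.commit()
    cmdA.waitUntilCompleted()

    // 2. CPU copy between the two devices' shared buffers.
    memcpy(sharedB.contents(), sharedA.contents(), src.length)

    // 3. Blit on device B: shared staging buffer -> private destination.
    let cmdB = queueB.makeCommandBuffer()!
    let blitB = cmdB.makeBlitCommandEncoder()!
    blitB.copy(from: sharedB, sourceOffset: 0, to: dst, destinationOffset: 0, size: src.length)
    blitB.endEncoding()
    cmdB.commit()
    cmdB.waitUntilCompleted()
}
```

The synchronous waitUntilCompleted() calls keep the sketch simple; in real code you'd likely use completion handlers or events instead of blocking the CPU.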
If there are dependencies among compute commands across devices, you may need to use events (specifically MTLSharedEvent, which can be signaled and waited on across devices) to explicitly synchronize them.
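A minimal sketch of that cross-device ordering, assuming you already have a command queue per device (the compute passes themselves are elided):

```swift
import Metal

// Order dependent work across two GPUs with an MTLSharedEvent:
// device B's command buffer stalls until device A signals the event.
func chainAcrossDevices(queueA: MTLCommandQueue, queueB: MTLCommandQueue) {
    guard let event = queueA.device.makeSharedEvent(),
          let cmdA = queueA.makeCommandBuffer(),
          let cmdB = queueB.makeCommandBuffer()
    else { return }

    // ... encode device A's compute pass on cmdA here ...
    cmdA.encodeSignalEvent(event, value: 1)   // raised when A's work completes
    cmdA.commit()

    cmdB.encodeWaitForEvent(event, value: 1)  // B waits for A's signal
    // ... encode device B's dependent compute pass on cmdB here ...
    cmdB.commit()
}
```

Note that a plain MTLEvent only synchronizes work within a single device; it's the shared variant that crosses device boundaries.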