I cannot run any Metal compute shader on my phone


I am trying to run my Metal program on my iPhone SE.

I tried many numbers for threadsPerThreadGroup and threadsPerGrid sizes and all of them gave me this error: TLValidateFeatureSupport:3539: failed assertion `Dispatch Threads with Non-Uniform Threadgroup Size is only supported on MTLGPUFamilyApple4 and later.'

Here is my code.

var threadsPerThreadGroup: MTLSize
var threadsPerGrid: MTLSize

computeCommandEncoder.setComputePipelineState(updateShader)

let w = updateShader.threadExecutionWidth

threadsPerThreadGroup = MTLSize(width: w, height: 1, depth: 1)
threadsPerGrid = MTLSize(width: Int(constants.bufferLength), height: 1, depth: 1)

if(frames % 2 == 0) {
    computeCommandEncoder.setBuffer(buffer1, offset: 0, index: 0)
    computeCommandEncoder.setBuffer(buffer2, offset: 0, index: 1)
} else {
    computeCommandEncoder.setBuffer(buffer2, offset: 0, index: 0)
    computeCommandEncoder.setBuffer(buffer1, offset: 0, index: 1)
}

computeCommandEncoder.setBytes(&constants, length: MemoryLayout<MyConstants>.stride, index: 2)

computeCommandEncoder.dispatchThreads(threadsPerGrid, threadsPerThreadgroup: threadsPerThreadGroup)

frames += 1

I am using iOS 13.4 and Xcode 11.4.

threadExecutionWidth evaluates to 32 and constants.bufferLength is 512.


Solution

  • Use [dispatchThreads] only if the device supports non-uniform threadgroup sizes.

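    That is not worded as clearly as it could be. In practice it means that dispatchThreads does not work on GPUs older than the A11, which is the first chip in MTLGPUFamilyApple4. That is exactly what the assertion in the question reports.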

    If you want a solution that works on all devices, you have to calculate how many threadgroups are needed to cover the grid yourself (the grid size divided by the threadgroup size, rounded up) and call dispatchThreadgroups with that count instead.

    If you want to have both methods in your code, you can detect the device's feature set at runtime.
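
The two options above can be sketched roughly as follows. This is only an illustration under some assumptions: `threadgroupCount` and `dispatch1D` are hypothetical helper names that do not appear in the question, the grid is one-dimensional as in the question, and the runtime check uses `supportsFamily(.apple4)`, which needs iOS 13 or later (the question targets iOS 13.4).

```swift
import Metal

// Hypothetical helper: how many threadgroups of `threadsPerGroup` threads
// are needed to cover `gridWidth` threads. Integer division rounded up.
func threadgroupCount(gridWidth: Int, threadsPerGroup: Int) -> Int {
    precondition(threadsPerGroup > 0, "threadsPerGroup must be positive")
    return (gridWidth + threadsPerGroup - 1) / threadsPerGroup
}

// Hypothetical helper: encodes a 1D dispatch of `count` threads, choosing
// the dispatch method based on what the GPU reports at runtime.
func dispatch1D(_ encoder: MTLComputeCommandEncoder,
                pipeline: MTLComputePipelineState,
                device: MTLDevice,
                count: Int) {
    let w = pipeline.threadExecutionWidth
    let threadsPerThreadgroup = MTLSize(width: w, height: 1, depth: 1)

    if #available(iOS 13.0, macOS 10.15, tvOS 13.0, *), device.supportsFamily(.apple4) {
        // A11 and later: Metal handles a partial final threadgroup itself.
        let threadsPerGrid = MTLSize(width: count, height: 1, depth: 1)
        encoder.dispatchThreads(threadsPerGrid,
                                threadsPerThreadgroup: threadsPerThreadgroup)
    } else {
        // Fallback for everything else: dispatch whole threadgroups only.
        // If `count` is not a multiple of `w`, the last group contains extra
        // threads, so the kernel must check its thread index against `count`.
        let groups = threadgroupCount(gridWidth: count, threadsPerGroup: w)
        let threadgroupsPerGrid = MTLSize(width: groups, height: 1, depth: 1)
        encoder.dispatchThreadgroups(threadgroupsPerGrid,
                                     threadsPerThreadgroup: threadsPerThreadgroup)
    }
}
```

With the values from the question (bufferLength of 512 and threadExecutionWidth of 32), the fallback path dispatches 16 threadgroups of 32 threads each, so no threads are left over. In the question's code, the dispatchThreads line would be replaced by something like `dispatch1D(computeCommandEncoder, pipeline: updateShader, device: device, count: Int(constants.bufferLength))`, where `device` is assumed to be the MTLDevice the pipeline was created from.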