ios, core-image, metal, cikernel

Metal vs GLSL CoreImage performance


In WWDC session 510, Apple engineers presented support for writing CIKernels in Metal and claimed it should run faster.

I've put together a test project that implements motion blur in both Metal and GLSL (the code is similar to the one from session 510).

Sometimes the Metal kernel is faster, sometimes the GLSL kernel is faster, but I definitely can't see the Metal kernel performing consistently and significantly better across the board. Is it supposed to be like this, or am I missing something?

Note: the project won't run on the simulator; you need a device with an A8 or later chip.


Solution

  • Looks like at least some of this is hardware-related. Here are my iPad Pro 10.5-inch results:

    glsl 1 took 229.572057723999ms
    glsl 2 took 49.1310358047485ms
    glsl 3 took 46.7269420623779ms
    glsl 4 took 53.08997631073ms
    glsl 5 took 48.9979982376099ms
    glsl 6 took 49.0390062332153ms
    glsl 7 took 52.5139570236206ms
    glsl 8 took 46.4930534362793ms
    glsl 9 took 39.6310091018677ms
    glsl 10 took 45.9860563278198ms
    metal 1 took 77.7549743652344ms
    metal 2 took 44.1800355911255ms
    metal 3 took 46.0859537124634ms
    metal 4 took 45.3709363937378ms
    metal 5 took 43.5279607772827ms
    metal 6 took 38.9848947525024ms
    metal 7 took 37.1809005737305ms
    metal 8 took 37.8340482711792ms
    metal 9 took 37.6850366592407ms
    metal 10 took 37.5720262527466ms
    

    And my iPhone SE results:

    glsl 1 took 394.147992134094ms
    glsl 2 took 94.601035118103ms
    glsl 3 took 81.4379453659058ms
    glsl 4 took 76.9931077957153ms
    glsl 5 took 77.0320892333984ms
    glsl 6 took 75.8579969406128ms
    glsl 7 took 76.9950151443481ms
    glsl 8 took 77.8199434280396ms
    glsl 9 took 79.7009468078613ms
    glsl 10 took 79.4800519943237ms
    metal 1 took 146.992921829224ms
    metal 2 took 88.6669158935547ms
    metal 3 took 81.8150043487549ms
    metal 4 took 78.1329870223999ms
    metal 5 took 79.5910358428955ms
    metal 6 took 93.6589241027832ms
    metal 7 took 94.8940515518188ms
    metal 8 took 89.0530347824097ms
    metal 9 took 84.3830108642578ms
    metal 10 took 77.949047088623ms
    

    A question and a thought:

    • What device produced your results?
    • I'd be curious whether a different kind of filter, say a color kernel, would perform differently.
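
To make the two result lists above easier to compare, here is a minimal sketch that averages each series after dropping the first iteration. Treating that first run as warm-up is an assumption on my part (the numbers show it is much slower, but nothing above says why), and `steady_state_mean` and `metal_vs_glsl` are hypothetical helper names. The timings are the values listed above, rounded to two decimals.

```python
from statistics import mean

# Timings in milliseconds, copied from the result lists above (rounded to 2 decimals).
TIMINGS = {
    "iPad Pro 10.5": {
        "glsl": [229.57, 49.13, 46.73, 53.09, 49.00, 49.04, 52.51, 46.49, 39.63, 45.99],
        "metal": [77.75, 44.18, 46.09, 45.37, 43.53, 38.98, 37.18, 37.83, 37.69, 37.57],
    },
    "iPhone SE": {
        "glsl": [394.15, 94.60, 81.44, 76.99, 77.03, 75.86, 77.00, 77.82, 79.70, 79.48],
        "metal": [146.99, 88.67, 81.82, 78.13, 79.59, 93.66, 94.89, 89.05, 84.38, 77.95],
    },
}


def steady_state_mean(runs, warmup=1):
    """Mean of the runs after discarding the first `warmup` iterations.

    Assumption: the first iteration is a warm-up run and not representative.
    """
    return mean(runs[warmup:])


def metal_vs_glsl(device):
    """Ratio of Metal to GLSL steady-state time; below 1.0 means Metal is faster."""
    series = TIMINGS[device]
    return steady_state_mean(series["metal"]) / steady_state_mean(series["glsl"])


if __name__ == "__main__":
    for device, series in TIMINGS.items():
        glsl = steady_state_mean(series["glsl"])
        metal = steady_state_mean(series["metal"])
        print(f"{device}: glsl {glsl:.1f} ms, metal {metal:.1f} ms, "
              f"ratio {metal_vs_glsl(device):.2f}")
```

On these numbers the steady-state averages come out at roughly 48 ms (GLSL) against 41 ms (Metal) on the iPad Pro, and roughly 80 ms against 85 ms on the iPhone SE, which fits the impression that the difference depends on the hardware. Ten iterations per kernel is a small sample, so the ratios are only a rough indication.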