objective-c, xcode, gpu, metal

What do some of the parameters in Xcode's GPU capture mean for a Metal app?


Xcode provides per-frame GPU capture, and I am using it in a Metal demo. I am confused about some of the parameters shown when I dig into the performance of the drawIndexedPrimitives call.

What do Texture Unit (Shader Core) Time and Stall Time mean? Xcode only hints that the texture unit is active or stalled. Why don't they add up to 100%? And what does it mean for the texture unit to be active? My other question is what GPU ring bandwidth means. Does the number 9.31 indicate the available bandwidth?


Solution

  • Shader core time indicates how much time your shader spends executing ALU instructions (i.e. math).

    Texture core time indicates how much time the shader spends fetching data from textures.

    The stall times indicate how long the shader waits on the other core before it can execute its instructions.

    What you've got shows that the shader core is spending ~87.3% of its time waiting for the texture core to fetch data across the GPU bus and filter it before the shader core can actually execute its math instructions.

    This means your shader isn't fully utilizing the shader cores.

    There are a couple of things you could do:

    • You could add some unrelated math to the shader without affecting its performance, because the shader core would otherwise sit idle while it waits.
    • You could use a different algorithm so that your math isn't as dependent on the texture data and doesn't need to wait as much.
    • You could reorder your draw operations or vertex data so that texture cache misses are less frequent and texture fetches are therefore faster.
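
    As a rough illustration of the last suggestion, here is a minimal CPU-side sketch in C++ that groups draws by the texture they sample before they are encoded. The DrawItem struct, its fields, and both helper functions are hypothetical names made up for this example, not part of the Metal API. It also assumes that grouping draws by texture improves texture cache locality in your scene, which you would need to confirm by capturing the frame again.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record for one draw call; the field names are illustrative
// and are not part of the Metal API.
struct DrawItem {
    std::uint32_t textureId;   // which texture the draw samples
    std::uint32_t indexCount;  // how many indices the draw submits
};

// Counts how often consecutive draws switch to a different texture.
std::size_t countTextureSwitches(const std::vector<DrawItem>& draws) {
    std::size_t switches = 0;
    for (std::size_t i = 1; i < draws.size(); ++i) {
        if (draws[i].textureId != draws[i - 1].textureId) {
            ++switches;
        }
    }
    return switches;
}

// Groups draws that share a texture. stable_sort keeps the original
// submission order within each texture group.
void sortByTexture(std::vector<DrawItem>& draws) {
    std::stable_sort(draws.begin(), draws.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return a.textureId < b.textureId;
                     });
}
```

    You would call sortByTexture on the frame's draw list before encoding it, and countTextureSwitches before and after gives a quick sanity check that the grouping did something. Reordering like this is only safe when the result does not depend on submission order, for example opaque geometry with depth testing enabled; blended draws generally need to keep their order.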