Xcode provides per-frame GPU capture, and I use it in a Metal demo. I am confused by some of the parameters when I dig into the performance of the drawIndexedPrimitives call.
I wonder what Texture Unit (Shader Core) Time and Stall Time mean. Xcode only hints that the unit is active or stalled. Why don't they add up to 100%? And what does it mean for the texture unit to be active? Another question: what does GPU ring bandwidth mean? Does the number 9.31 indicate the available bandwidth?
Shader core time indicates how much time your shader is spending executing ALU instructions (i.e. math).
Texture core time indicates how much time the shader spends fetching data from textures.
The stall times indicate how long the shader is waiting on the other core before it can execute its instructions.
What you've got shows that the shader core is spending ~87.3% of its time waiting for the texture core to fetch data across the GPU bus and filter it before it can actually execute the math instructions.
This means your shader isn't fully utilizing the shader cores.
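To put a number on that, here is a rough back-of-the-envelope sketch. It assumes the stall figure is a percentage of the same measured interval as the active time; the function name `max_active_percent` is made up for illustration and is not part of Xcode or Metal.

```python
def max_active_percent(stall_percent: float) -> float:
    """Upper bound on the share of time a core can spend executing,
    given the share of time it is reported as stalled.

    Assumption: both figures are percentages of the same interval.
    The remainder is only an upper bound, since part of it may be
    idle time rather than useful work.
    """
    if not 0.0 <= stall_percent <= 100.0:
        raise ValueError("stall_percent must be between 0 and 100")
    return 100.0 - stall_percent


# Shader core stall figure quoted above.
shader_core_stall = 87.3
print(f"Shader core busy at most {max_active_percent(shader_core_stall):.1f}% of the time")
```

So with an ~87.3% stall, the shader core can be doing math for at most about 12.7% of the time, and possibly less if some of the remainder is idle.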
There are a couple of things you could do: