opengl, graphics, gpu

Why isn't gl_InstanceID a dynamically uniform expression?


In OpenGL vertex shaders, the only built-in input that is considered dynamically uniform is gl_DrawID. I can guess that the decision not to make gl_InstanceID dynamically uniform was made to allow implementations to group vertices from different instances into a single vertex shader warp (or wavefront). However, it's well known that instanced draws with a small number of vertices per instance are bad for performance, as no major desktop GPU vendor actually groups multiple instances into a single warp (which results in low warp occupancy when there are many instances with few vertices each). This would seem to imply that, in practice, gl_InstanceID could very well be made dynamically uniform. What was the rationale for not making gl_InstanceID (and also gl_BaseInstance and gl_BaseVertex) a dynamically uniform expression? Are there any GPUs that can actually group multiple instances per wavefront, or were there at the time the specification was being written?


Solution

  • Dynamically uniform expressions in OpenGL are defined as expressions that have the same value for all invocations in the same invocation group. An invocation group in rendering mode is the set of all invocations of all shaders caused by a single draw command. A regular draw command (glDrawElements, glDrawArrays) or an instanced draw command (glDrawElementsInstanced, glDrawArraysInstanced, ...) counts as a single draw command, in contrast to multi-draw commands, which have a separate invocation group for each of their subcommands.

    How the GPU groups invocations into warps (a concept OpenGL doesn't have for rendering invocations) doesn't really play a role here, since it is not the deciding factor in identifying dynamically uniform expressions.

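    Based on these definitions, it should be easy to see why gl_InstanceID can't be dynamically uniform unless there is exactly one instance: all instances of an instanced draw belong to the same invocation group, yet gl_InstanceID takes a different value for each instance.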

    Please note that the definition of invocation group stated above is only valid for rendering mode. Compute mode has its own definition.
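
The argument above can be checked with a small model. The following Python sketch is not OpenGL code: it only enumerates the built-in values each vertex shader invocation would see and groups them as described in this answer (one invocation group per draw command, one per subcommand of a multi-draw). The names Invocation, instanced_draw, multi_draw and is_dynamically_uniform are made up for illustration, and the model assumes gl_DrawID is 0 for non-multi draws.

```python
from collections import namedtuple

# One vertex shader invocation, tagged with the built-in values it would see.
Invocation = namedtuple("Invocation", ["gl_DrawID", "gl_InstanceID", "gl_VertexID"])


def instanced_draw(vertex_count, instance_count, draw_id=0):
    """All invocations of one draw (or one multi-draw subcommand).

    Per the definition above, they form a single invocation group,
    no matter how many instances the draw has.
    """
    return [
        Invocation(draw_id, instance, vertex)
        for instance in range(instance_count)
        for vertex in range(vertex_count)
    ]


def multi_draw(sub_draws):
    """One invocation group per subcommand.

    sub_draws is a list of (vertex_count, instance_count) pairs.
    """
    return [
        instanced_draw(vertex_count, instance_count, draw_id)
        for draw_id, (vertex_count, instance_count) in enumerate(sub_draws)
    ]


def is_dynamically_uniform(groups, builtin):
    """True if `builtin` has exactly one value within every invocation group."""
    return all(
        len({getattr(inv, builtin) for inv in group}) == 1
        for group in groups
    )


# A multi-draw with two subcommands: 3 vertices x 4 instances, 6 vertices x 2 instances.
groups = multi_draw([(3, 4), (6, 2)])
print(is_dynamically_uniform(groups, "gl_DrawID"))      # True
print(is_dynamically_uniform(groups, "gl_InstanceID"))  # False

# With exactly one instance, gl_InstanceID happens to be uniform.
print(is_dynamically_uniform([instanced_draw(3, 1)], "gl_InstanceID"))  # True
```

Note that nothing in this model refers to warps or wavefronts: the result depends only on how invocation groups are defined, which is the point made above.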