The [branch] attribute can mark an if statement in HLSL so that only one branch is executed, instead of executing all branches and discarding the unused results as [flatten] does.
My question is: how can this actually work when a branch diverges within a warp/wavefront? As far as I know, in that case all threads must execute every branch taken by any thread in the warp (as with [flatten]), which is a consequence of the fact that they all sit in the same SIMD block and must execute the same instruction.
Since the GeForce 6xx series, GPUs have actually supported branching, though in a limited form and at a performance cost. The [branch] and [flatten] tags are just hints to the compiler to prefer one or the other if supported and possible. The outcome basically depends on the hardware and on the driver, so different hardware or different driver versions might end up executing the code differently from what you specified with the tag.
You can find more info online, for example check this link