I understand that the expression x = m*a+d is most efficiently written as x = mad(m,a,d), because at the assembly level only one instruction is needed rather than a separate multiply and add. My question is about the best way to write the expression x += m*a. Should it be written as x = mad(m,a,x) or as x += m*a? The difference is too subtle to profile, but I'm wondering whether anyone can see a difference at the assembly level. (I don't know how to view the assembly code.)
Compile the shader with 'fxc' (the HLSL command-line compiler that ships with the DirectX and Windows SDKs) and have a look at the DXBC disassembly it prints out.
In this case, the compiler has no trouble determining that x += m * a; is the same as x = mad(m,a,x); and so the resulting generated code will be the same.
Both expressions produce the same bytecode:
mad o0.x, v0.x, v0.y, v0.z
Even when written as x = m * a + d; the compiler is free to (and does) use a mad instruction in the DirectX bytecode. There is no performance penalty to writing it in any of the three ways you described.
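If you want to check this yourself, here is a minimal sketch of a test shader. The file name, the entry point names, and the choice of the ps_4_0 profile are all assumptions made for illustration; the inputs are read from an interpolated value so the compiler cannot fold the arithmetic into a constant. The register names in your listing may differ depending on the shader's inputs and outputs.

```hlsl
// mad_test.hlsl (hypothetical file name)
//
// Compile one entry point at a time; with no output file given,
// fxc prints the disassembly to the console:
//   fxc /T ps_4_0 /E UseMad    mad_test.hlsl
//   fxc /T ps_4_0 /E UsePlusEq mad_test.hlsl
//   fxc /T ps_4_0 /E UseMulAdd mad_test.hlsl
// Add /Fc listing.asm to write the listing to a file.

// Convention used here (an assumption): v.x = m, v.y = a, v.z = x (or d).

// Variant 1: explicit mad intrinsic.
float UseMad(float3 v : TEXCOORD0) : SV_Target
{
    float x = v.z;
    x = mad(v.x, v.y, x);
    return x;
}

// Variant 2: compound assignment.
float UsePlusEq(float3 v : TEXCOORD0) : SV_Target
{
    float x = v.z;
    x += v.x * v.y;
    return x;
}

// Variant 3: plain multiply followed by add.
float UseMulAdd(float3 v : TEXCOORD0) : SV_Target
{
    float d = v.z;
    float x = v.x * v.y + d;
    return x;
}
```

Comparing the three listings side by side should show whether each variant collapses to a single mad instruction on your compiler version.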