Tags: selection, parallel-processing, shader, simd, cg

How do you implement an efficient parallel SIMD compare and select in Cg?


How do you do parallel selection efficiently?

For example, given this scalar code, is there a way to write it so that the Cg compiler makes it execute in parallel (SIMD), potentially using a branch-free selection as well?

            Out.x = ( A.x <= threshold) ? B.x : C.x ;
            Out.y = ( A.y <= threshold) ? B.y : C.y ;
            Out.z = ( A.z <= threshold) ? B.z : C.z ;
            Out.w = ( A.w <= threshold) ? B.w : C.w ;

Solution

  • Apparently, I missed these lines in the Cg manual:

    The ?:, ||, &&, &, and comparison operators can
    be used with bool vectors to perform multiple
    conditional operations simultaneously.
    

    So I tried this out and it seems to work:

    Out.xyzw = ( A.xyzw <= threshold) ? B.xyzw : C.xyzw ;
    

    I guess I didn't expect the simplest solution to just work!

    My coworker, who is a graphics programmer, also suggested that on some platforms the Cg compiler might be intelligent enough to optimize the original scalar code for me. That is not guaranteed, though, so it is better to specify parallel SIMD operations explicitly where possible.
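
    As a minimal sketch of the same idea, the comparison result can be held in a bool4 mask before the select, which makes the per-component behavior explicit. The helper name selectLE, its parameter names, and the usage line are hypothetical and chosen only for illustration. Whether the generated code is actually branch-free depends on the profile and compiler, which this sketch does not verify:

    ```cg
    // Hypothetical helper (not from the Cg manual): per-component select.
    float4 selectLE(float4 a, float threshold, float4 b, float4 c)
    {
        // The scalar threshold is compared against each component of a,
        // producing one bool per component.
        bool4 mask = (a <= threshold);

        // With a bool4 condition, ?: picks b or c independently per component.
        return mask ? b : c;
    }

    // Possible usage, intended to match the four scalar statements in the question:
    //     Out = selectLE(A, threshold, B, C);
    ```

    Keeping the mask in a named variable also lets it be reused if several values need to be selected by the same comparison.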