My input data is 16-bit, and I need to find the median of 3 values using the SSE2 instruction set.
If I have three 16-bit input values A, B and C, I thought of doing it like this:
D = max( max( A, B ), C )
E = min( min( A, B ), C )
median = A + B + C - D - E
The C functions I am planning to use are:
Can anyone suggest a better way?
Your idea is quite clever, but you can do it with fewer operations using just max and min:
t1 = min(A, B)
t2 = max(A, B)
t3 = min(t2, C)
median = max(t1, t3)
This takes just 4 SSE instructions, compared with 8 in your original implementation (2 max, 2 min, 2 additions and 2 subtractions).
Note that this is actually just a pruned sorting network for N = 3.
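
For illustration, here is a minimal sketch of those four steps using SSE2 intrinsics. It assumes the 16-bit data is signed, since the question does not say, so it uses `_mm_min_epi16` and `_mm_max_epi16`. The function names `median3_epi16` and `median3_i16` and the array layout are made up for this example, not something from the question.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Median of three vectors, each holding eight signed 16-bit lanes.
   Assumption: the data is signed (the question does not specify). */
static inline __m128i median3_epi16(__m128i a, __m128i b, __m128i c)
{
    __m128i t1 = _mm_min_epi16(a, b);   /* t1 = min(A, B)  */
    __m128i t2 = _mm_max_epi16(a, b);   /* t2 = max(A, B)  */
    __m128i t3 = _mm_min_epi16(t2, c);  /* t3 = min(t2, C) */
    return _mm_max_epi16(t1, t3);       /* median = max(t1, t3) */
}

/* Hypothetical helper: element-wise median of three arrays of length n.
   Processes 8 elements per iteration, then a scalar tail. */
static void median3_i16(const int16_t *a, const int16_t *b,
                        const int16_t *c, int16_t *out, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        /* Unaligned loads, so the caller need not align the buffers. */
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        __m128i vc = _mm_loadu_si128((const __m128i *)(c + i));
        _mm_storeu_si128((__m128i *)(out + i), median3_epi16(va, vb, vc));
    }
    for (; i < n; ++i) {
        /* Same min/max network, one element at a time. */
        int16_t t1 = a[i] < b[i] ? a[i] : b[i];
        int16_t t2 = a[i] < b[i] ? b[i] : a[i];
        int16_t t3 = t2 < c[i] ? t2 : c[i];
        out[i] = t1 > t3 ? t1 : t3;
    }
}
```

If the data is unsigned, note that `_mm_min_epu16` and `_mm_max_epu16` are SSE4.1, not SSE2. One common workaround on SSE2 is to XOR every input with `0x8000`, run the signed version, and XOR the result with `0x8000` again.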