I am optimizing bottleneck code:
int sum = ........
sum = (sum >> _bitShift);
if (sum > 32000)
    sum = 32000; //if we get an overflow, saturate output
else if (sum < -32000)
    sum = -32000; //if we get an underflow, saturate output
short result = static_cast<short>(sum);
I would like to write the saturation check as a single "if condition" or, even better, with no "if condition" at all, to make this code faster. I don't need saturation at exactly 32000; any similar value, like 32768, is acceptable.
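One way to get down to a single test is to shift the valid range so that it starts at zero and compare as unsigned. The sketch below assumes a 32-bit `int` and an arithmetic right shift for negative values (true on mainstream x86/x64 compilers, guaranteed only from C++20). The function name `saturate_one_test` is made up for illustration, and it saturates at -32768/32767 rather than at ±32000.

```cpp
#include <cstdint>

// Hypothetical helper: saturate a 32-bit value to the int16_t range
// using a single range test.
inline std::int16_t saturate_one_test(std::int32_t sum) {
    // Adding 32768 maps the valid range [-32768, 32767] onto [0, 65535].
    // Done in unsigned arithmetic, anything outside that range wraps to a
    // value above 65535, so one comparison covers overflow and underflow.
    if (static_cast<std::uint32_t>(sum) + 32768u > 65535u) {
        // sum >> 31 is 0 for positive values and -1 for negative ones
        // (assumes arithmetic shift), so the XOR yields 32767 or -32768.
        sum = (sum >> 31) ^ 32767;
    }
    return static_cast<std::int16_t>(sum);
}
```

With this, `saturate_one_test(40000)` gives 32767 and `saturate_one_test(-40000)` gives -32768, while in-range values pass through unchanged. Whether it is actually faster than the two-comparison version depends on the compiler and the data, so it is worth measuring.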
According to this page, there is a saturation instruction in ARM. Is there anything similar in x86/x64?
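For reference, x86 does have saturating instructions in its SIMD extensions: SSE2's `packssdw`, exposed as the `_mm_packs_epi32` intrinsic, narrows signed 32-bit lanes to signed 16-bit lanes and saturates at -32768 and 32767. Below is a minimal sketch, assuming an SSE2-capable target. The helper name `saturate_to_i16` and the `HAVE_SSE2` macro are made up for illustration, and a plain scalar fallback is included for other targets.

```cpp
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2 1     // hypothetical macro, used only in this sketch
#endif

// Hypothetical helper: saturate one 32-bit value to the int16_t range.
inline std::int16_t saturate_to_i16(std::int32_t sum) {
#ifdef HAVE_SSE2
    __m128i v = _mm_cvtsi32_si128(sum);      // sum in lane 0, other lanes zero
    __m128i packed = _mm_packs_epi32(v, v);  // packssdw: signed saturation to 16 bits
    // The low 16 bits of the result hold the saturated value.
    return static_cast<std::int16_t>(_mm_cvtsi128_si32(packed));
#else
    // Scalar fallback with the same bounds.
    if (sum > 32767) return 32767;
    if (sum < -32768) return -32768;
    return static_cast<std::int16_t>(sum);
#endif
}
```

For a single value this may well be no faster than a pair of conditional moves. The instruction tends to pay off when several sums are packed at once, since one `_mm_packs_epi32` call saturates eight 32-bit values from two registers.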
Are you sure you can beat the compiler at this?
Here's the x64 retail build with max size optimizations enabled, compiled with Visual Studio v15.7.5.
ecx contains the initial value at the start of this block. eax holds the saturated value when it is done.
return (x > 32767) ? 32767 : ((x < -32768) ? -32768 : x);
mov edx,0FFFF8000h
movzx eax,cx
cmp ecx,edx
cmovl eax,edx
mov edx,7FFFh
cmp ecx,edx
movzx eax,ax
cmovg eax,edx
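To try this yourself, the expression above can be wrapped in a small function and fed to your own compiler. This is only a sketch: the function names are made up for illustration, and the second variant uses `std::min`/`std::max` as an equivalent way to state the same bounds. Compilers commonly turn both into conditional moves like the listing above, but that is not guaranteed, so check the generated assembly.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical wrapper around the ternary expression shown above.
inline std::int16_t saturate_ternary(int x) {
    return static_cast<std::int16_t>(
        (x > 32767) ? 32767 : ((x < -32768) ? -32768 : x));
}

// Same bounds expressed with std::min / std::max.
inline std::int16_t saturate_minmax(int x) {
    return static_cast<std::int16_t>(std::max(-32768, std::min(x, 32767)));
}
```

Both return 32767 for any input above 32767, -32768 for any input below -32768, and the input itself otherwise.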