c++c optimization bit-manipulation branchless

Constrain a 16 bit signed value between 0 and 4095 using Bit Manipulation only (without branching)

I want to constrain the value of a signed short variable between 0 and 4095, after which I take the most significant 8 bits as my final value for use elsewhere. Right now I'm doing it in a basic manner as below:

short color     = /* some external source */;
/* 
 * I get the color value as a 16 bit signed integer from an
 * external source I cannot trust. 16 bits are being used here
 * for higher precision.
 */

if ( color < 0 ) {
    color = 0;
}
else if ( color > 4095 ) {
    color = 4095;
}

unsigned char color8bit  = 0xFF & (color >> 4);
/*
 * color8bit is my final value which I would actually use
 * in my application.
 */

Is there any way this can be done using bit manipulation only, i.e. without using any conditionals? It might help quite a bit in speeding things up as this operation is happening thousands of time in the code.

The following won't help as it doesn't take care of edge cases such as negative values and overflows:

unsigned char color8bit = 0xFF & (( 0x0FFF & color ) >> 4 );

Edit: Adam Rosenfield's answer is the one which takes the correct approach but its incorrectly implemented. ouah's answer gives correct results but takes a different approach that what I originally intended to find out.

This is what I ended up using:

const static short min = 0;
const static short max = 4095;
color = min ^ (( min ^ color ) & -( min < color ));
color = max ^ (( color ^ max ) & -( color < max ));
unsigned char color8bit = 0xFF & (( 0x0FFF & color ) >> 4 );

Solution

short color = /* ... */
color =   ((((!!(color >> 12)) * 0xFFF)) | (!(color >> 12) * color ))
        & (!(color >> 15) * 0xFFF);

unsigned char color8bit  = 0xFF & (color >> 4);

It assumes two's complement representation.

This has the advantage of not using any equality or relational operators. There are situations you want to avoid branches at all costs: in some security applications you don't want the attackers to perform branch predictions. Without branches (in embedded processors particularly) you can make your function run in constant time for all inputs.

Note that: x * 0xFFF can be further reduced to (x << 12) - x. Also the multiplication in (!(color >> 12) * color ) can also be further optimized as the left operand of * here is 0 or 1.

EDIT:

I add a little explanation: the expression above simply does the same as below without the use of the conditional and relational operators:

y =   ((y > 4095 ? 4095 : 0) | (y > 4095 ? 0 : y))
    & (y < 0 ? 0 : 4095);

EDIT2:

as @HotLicks correctly noted in his comment, the ! is still a conceptual branch. Nevertheless it can also be computed with bitwise operators. For example !!a can be done with the trivial:

b = (a >> 15 | a >> 14 | ... | a >> 1 | a) & 1

and !a can be done as b ^ 1. And I'm sure there is a nice hack to do it more effectively.