Rotate left and back to the right for sign extension with (signed short) cast in C


Previously, I had the following C code, through which I intended to do sign extension of variable 'sample' after a cast to 'signed short' of variable 'sample_unsigned'.

unsigned short sample_unsigned;
signed short sample;

sample = ((signed short) sample_unsigned << 4) >> 4;

In binary representation, I would expect 'sample' to have its most significant bit repeated 4 times. For instance, if:

sample_unsigned = 0x0800 (corresponding to "100000000000" in binary)

As I understand it, 'sample' should end up as:

sample = 0xF800 (corresponding to "1111100000000000" in binary)

However, 'sample' always ended up being the same as 'sample_unsigned', and I had to split the assignment statement as below, which worked. Why is that?

sample = ((signed short) sample_unsigned << 4);
sample >>= 4;

Solution

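  • Your approach will not work. There is no guarantee that right-shifting a negative signed value preserves the sign; the result is implementation-defined. Even where it does, the one-line version would only work with a 16 bit int. With an int of 32 bits or more, the shift operands are promoted to int, so the expression just shifts the same data back and forth; you have to replicate the sign manually into the upper bits. The split version appears to work only because the assignment to 'sample' truncates the intermediate result to 16 bits before the right shift. In general, bit shifts of signed values are delicate - see the [standard](http://port70.net/~nsz/c/c11/n1570.html#6.5.7) for details. Some combinations invoke undefined behaviour (for example, left-shifting a negative value). It is better to avoid them and just work with unsigned integers.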
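
    To make the "shifts the same data back and forth" point concrete, here is a minimal sketch of both versions from the question. The function names are made up for illustration, int16_t/uint16_t stand in for the question's short types, and the comments assume a 32 bit int and a compiler (such as GCC or Clang) that wraps on narrowing conversions and shifts negative values arithmetically. Neither of those is guaranteed by the standard.

    ```c
    #include <stdint.h>

    /* One-line version from the question. The cast binds tighter than <<,
       and both shift operands are then promoted to int. With a 32 bit int
       (assumed), bit 15 is not the sign bit of the intermediate value, so
       >> 4 simply undoes << 4. Only meant for inputs that fit in 12 bits. */
    int16_t extend_one_line(uint16_t sample_unsigned)
    {
        return (int16_t)(((int16_t)sample_unsigned << 4) >> 4);
    }

    /* Split version from the question. Storing the shifted value in a
       16 bit variable truncates it (implementation-defined conversion), so
       bit 15 becomes the sign bit before the right shift. The right shift
       of a negative value is implementation-defined as well. */
    int16_t extend_two_steps(uint16_t sample_unsigned)
    {
        int16_t sample = (int16_t)((int16_t)sample_unsigned << 4);
        sample >>= 4;
        return sample;
    }
    ```

    Under those assumptions, extend_one_line(0x0800) gives 0x0800 back unchanged, while extend_two_steps(0x0800) gives -2048 (bit pattern 0xF800).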

    On most platforms, however, the following works. It is not necessarily slower (on platforms with 16 bit int, it is likely even faster):

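    uint16_t usample;           // 12 bit two's complement value in bits 0..11
    int16_t ssample;

    ssample = (int16_t)usample;
    if ( ssample & 0x800 )      // bit 11 set: the 12 bit value is negative
        ssample |= ~0xFFF;      // set all bits above bit 11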
    

    The cast to int16_t is implementation-defined for values that do not fit; your compiler must document how it is performed. For (almost?) all recent implementations no extra operation is performed. Just verify in the generated code or your compiler documentation. The bitwise OR relies on intX_t using two's complement, which the standard guarantees for the exact-width types - as opposed to the standard types such as short or int (in C11).
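
    If the implementation-defined cast is a concern, the same sign replication can be done entirely on unsigned values, with the conversion to a signed value done by arithmetic at the end. This is a sketch of a variant, not the code above: the function name is hypothetical, and masking the input to 12 bits is an assumption about the data.

    ```c
    #include <stdint.h>

    int16_t sign_extend_12(uint16_t usample)
    {
        /* Keep only the 12 data bits (assumption about the input). */
        uint16_t u = (uint16_t)(usample & 0x0FFFu);

        /* Bit 11 is the sign bit of the 12 bit value: copy it into bits 12..15. */
        if (u & 0x0800u)
            u = (uint16_t)(u | 0xF000u);

        /* Turn the 16 bit pattern into a signed value by arithmetic, so that
           no out-of-range conversion takes place. */
        if (u & 0x8000u)
            return (int16_t)((int32_t)u - INT32_C(65536));
        return (int16_t)u;
    }
    ```

    For example, sign_extend_12(0x0800) returns -2048 and sign_extend_12(0x07FF) returns 2047.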

    On 32 bit platforms, there might be a dedicated instruction to sign-extend a bit field (e.g. SBFX on ARM Cortex-M3/M4), or the compiler might provide a builtin function for it. Depending on your use case and speed requirements, it might be worth using them.

    Update:

    An alternative approach would be using a bitfield structure:

    struct {
        int16_t val : 12;    // assuming 12 bit signed value like above
    } extender;
    
    extender.val = usample;
    ssample = extender.val;
    

    This might result in using the same assembler instructions I proposed above.
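
    Wrapped into a function, the bitfield approach might look like the sketch below. The function name is hypothetical. It declares the field as signed int rather than int16_t, because bit-field types other than int, signed int, unsigned int and _Bool are implementation-defined. Storing a value with bit 11 set into the 12 bit field is still an implementation-defined conversion, so the result is what GCC and Clang produce rather than something the standard promises.

    ```c
    #include <stdint.h>

    int16_t sign_extend_12_bitfield(uint16_t usample)
    {
        struct {
            signed int val : 12;    /* explicitly signed 12 bit field */
        } extender;

        /* Mask to 12 bits (assumption about the input), then let the
           bit-field store reinterpret bit 11 as the sign bit. */
        extender.val = (int)(usample & 0x0FFFu);

        /* Reading the field back yields the sign-extended value. */
        return (int16_t)extender.val;
    }
    ```

    On such compilers, sign_extend_12_bitfield(0x0800) returns -2048 and sign_extend_12_bitfield(0x07FF) returns 2047.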