Tags: c, bit-manipulation, bitwise-operators, 128-bit

Bitwise operations on 128-bit values on a non-sse2 arch


I am writing a routine in C, targeted for an embedded platform.
In the routine I need to perform bitwise XOR and SHIFT RIGHT operations on 128-bit values.
The target architecture doesn't have SSE2, so there is no native support for 128-bit operations.
I came across this answer which simulates the SHIFT operations in software.
My question is: are there better ways of doing this? I am looking for a better data structure to represent 128-bit values, and a more efficient way to simulate the SHIFT and XOR operations than the recursion used in the linked answer. I want to minimise use of the limited stack memory.


Solution

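  • You can use a structure to store the 128-bit value as four 32-bit words, with a as the most significant word and d as the least significant:

    #include <stdint.h>

    typedef struct
    {
        uint32_t a; // bits 127..96 (most significant)
        uint32_t b; // bits 95..64
        uint32_t c; // bits 63..32
        uint32_t d; // bits 31..0 (least significant)
    } Type_128bit;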
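
    XOR needs no carry between words, so with this layout it reduces to four independent 32-bit XORs. A minimal sketch, assuming the structure above; the name xor128 is illustrative and not part of the original answer:

```c
#include <stdint.h>

/* Same layout as the structure above: a is the most significant word. */
typedef struct
{
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} Type_128bit;

/* Hypothetical helper: *out = *x ^ *y, word by word.
   out may point to the same object as x or y. */
static void xor128(const Type_128bit *x, const Type_128bit *y, Type_128bit *out)
{
    out->a = x->a ^ y->a;
    out->b = x->b ^ y->b;
    out->c = x->c ^ y->c;
    out->d = x->d ^ y->d;
}
```

    Passing pointers keeps the stack cost to a few addresses; no 16-byte copies are made for the arguments.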
    

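    Then you can write a left shift function as follows:

    // Shifts *in left by value bits (0..127) and stores the result in *out.
    // Returns 0 on success, -1 if value is out of range.
    int leftshift(const Type_128bit *in, Type_128bit *out, int value)
    {
        int val;
        if (value < 0 || value >= 128)
        {
            return (-1); // error condition
        }
        else if (value < 32)
        {
            // Each word takes the top `value` bits of the next lower word.
            // (x >> 1) >> (31 - value) equals x >> (32 - value), but stays
            // defined for value == 0, where a shift by 32 would be undefined.
            out->a = (in->a << value) | ((in->b >> 1) >> (31 - value));
            out->b = (in->b << value) | ((in->c >> 1) >> (31 - value));
            out->c = (in->c << value) | ((in->d >> 1) >> (31 - value));
            out->d = in->d << value;
        }
        else if (value < 64)
        {
            val = value - 32; // whole-word move by one word, then shift by val
            out->a = (in->b << val) | ((in->c >> 1) >> (31 - val));
            out->b = (in->c << val) | ((in->d >> 1) >> (31 - val));
            out->c = (in->d << val);
            out->d = 0x00;
        }
        else if (value < 96)
        {
            val = value - 64; // whole-word move by two words, then shift by val
            out->a = (in->c << val) | ((in->d >> 1) >> (31 - val));
            out->b = (in->d << val);
            out->c = 0x00;
            out->d = 0x00;
        }
        else // value < 128
        {
            val = value - 96; // only the lowest word contributes
            out->a = (in->d << val);
            out->b = 0x00;
            out->c = 0x00;
            out->d = 0x00;
        }
        return (0); //success
    }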
    

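    This avoids the recursion of the linked solution and should run faster, since each call does a fixed amount of work and needs only one local variable. The trade-off is larger code size, and every branch, including the boundary shifts of 0, 32, 64 and 96 bits, needs careful testing.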
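
    The question asks for a right shift, while the function above shifts left. Below is a minimal sketch of a right-shift counterpart, assuming the same Type_128bit layout with a as the most significant word. The name rightshift and the loop-based structure are illustrative choices, not part of the original answer. The loop splits the shift into a whole-word move and a sub-word shift, which trades the four explicit branches for smaller code at the cost of a 16-byte scratch array.

```c
#include <stdint.h>

/* Same layout as the structure above: a is the most significant word. */
typedef struct
{
    uint32_t a;
    uint32_t b;
    uint32_t c;
    uint32_t d;
} Type_128bit;

/* Hypothetical counterpart to leftshift: logical right shift of *in by
   value bits (0..127), result in *out. Returns 0 on success, -1 if value
   is out of range. out may point to the same object as in. */
static int rightshift(const Type_128bit *in, Type_128bit *out, int value)
{
    uint32_t w[4];
    int words, bits, i;

    if (value < 0 || value >= 128)
    {
        return -1; /* error condition */
    }

    w[0] = in->a; /* w[0] is the most significant word */
    w[1] = in->b;
    w[2] = in->c;
    w[3] = in->d;

    words = value / 32; /* whole 32-bit words to move */
    bits = value % 32;  /* remaining shift within a word */

    /* Walk from the least significant word upward. Each step reads only
       words at index <= i that have not been overwritten yet, so the
       update can be done in place. */
    for (i = 3; i >= 0; i--)
    {
        int src = i - words;
        uint32_t lo = (src >= 0) ? w[src] : 0;     /* supplies the low bits */
        uint32_t hi = (src >= 1) ? w[src - 1] : 0; /* supplies the carried bits */

        /* (hi << 1) << (31 - bits) equals hi << (32 - bits), but stays
           defined when bits == 0, where a shift by 32 would be undefined. */
        w[i] = (lo >> bits) | ((hi << 1) << (31 - bits));
    }

    out->a = w[0];
    out->b = w[1];
    out->c = w[2];
    out->d = w[3];
    return 0; /* success */
}
```

    If code size matters less than stack use, the same logic can instead be unrolled into four branches that mirror leftshift, which removes the scratch array.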