c++ type-conversion arm machine-code sign-extension

How to safely extract a signed field from a uint32_t into a signed number (int or int32_t)

I have a project in which I am getting a vector of 32-bit ARM instructions, and a part of the instructions (offset values) needs to be read as signed (two's complement) numbers instead of unsigned numbers.

I used a vector of uint32_t because all the opcodes and registers are read as unsigned and the whole instruction is 32 bits wide.

For example:

I have this 32-bit ARM instruction encoding:

uint32_t addr = 0b00110001010111111111111111110110;

The last 19 bits are the branch offset, which I need to read as a signed integer (the branch displacement). This part: 1111111111111110110

I have this function, which takes the whole 32-bit instruction as its parameter. I shift left by 13 places and then right by 13 places again, to keep only the offset value and remove the other part of the instruction.

I have tried casting the result to different signed types, using different kinds of casts and other C++ functions, but it always prints the number as if it were unsigned.

int getCat1BrOff(uint32_t inst)
{
    uint32_t temp = inst << 13;
    uint32_t brOff = temp >> 13;
    return (int)brOff;
}

I get the decimal number 524278 instead of -10.

The last option, which I don't think is the best one but which might work, is to put all the binary digits in a string, invert the bits, add 1, and then convert the new binary number back into decimal, as I would do it on paper. But that is not a good solution.

Solution

It boils down to doing a sign extension where the sign bit is the 19th one (bit 18, counting from zero). There are two ways:

1. Use arithmetic shifts.
2. Detect the sign bit and OR the high bits with ones.

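Before C++20 there is no portable way to do 1. in C++, but whether it works can be checked at compile time. Please correct me if the code below is UB, but I believe it is only implementation-defined, and that is what the compile-time check covers. The only questionable parts are the unsigned-to-signed conversion of a value that does not fit, and the right shift of a negative value; both are implementation-defined rather than undefined. (Since C++20, both are fully specified: the conversion wraps modulo 2^32 and the right shift is arithmetic.)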

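#include <cstdint>

int getCat1BrOff(uint32_t inst)
{
    // Compile-time check: does >> on a negative int32_t replicate the sign bit?
    if constexpr (int32_t(0xFFFFFFFFu) >> 1 == int32_t(0xFFFFFFFFu))
    {
        // Move bit 18 up to bit 31, then let the arithmetic shift sign-extend it.
        return int32_t(inst << uint32_t{13}) >> int32_t{13};
    }
    else
    {
        // Fallback: keep the low 19 bits and fill bits 19..31 by hand.
        int32_t offset = inst & 0x0007FFFF;
        if (offset & 0x00040000) // bit 18 is the sign bit of the field
        {
            offset |= 0xFFF80000;
        }
        return offset;
    }
}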

Or a more generic solution:

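#include <cstdint>

// Sign-extends the low N bits of value to a full int32_t.
template <uint32_t N>
int32_t signExtend(uint32_t value)
{
    static_assert(N > 0 && N <= 32);
    constexpr uint32_t unusedBits = (uint32_t(32) - N);
    // Compile-time check: does >> on a negative int32_t replicate the sign bit?
    if constexpr (int32_t(0xFFFFFFFFu) >> 1 == int32_t(0xFFFFFFFFu))
    {
        // Move the field's sign bit to bit 31, then shift back arithmetically.
        return int32_t(value << unusedBits) >> int32_t(unusedBits);
    }
    else
    {
        // Fallback: mask the field and set the high bits by hand if it is negative.
        constexpr uint32_t mask = uint32_t(0xFFFFFFFFu) >> unusedBits;
        value &= mask;
        if (value & (uint32_t(1) << (N-1)))
        {
            value |= ~mask;
        }
        return int32_t(value);
    }
}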

https://godbolt.org/z/rb-rRB