c++ arrays int type-conversion cbor

Byte vector to integer type: shift and add or implicit conversion through a union?


I am currently implementing CBOR and repeatedly need to read 1, 2, 4, or 8 bytes from a byte array and combine them into an integer type of the same size.

For the 4-byte case, I currently use this template function (vec is the byte vector I am reading from, current_idx marks the position in the vector from which I want to start reading 4 bytes):

template<typename T, typename std::enable_if<sizeof(T) == 4, int>::type = 0>
static T get_from_vector(const std::vector<uint8_t>& vec, const size_t current_idx)
{
    return static_cast<T>((static_cast<T>(vec[current_idx]) << 030) +
                          (static_cast<T>(vec[current_idx + 1]) << 020) +
                          (static_cast<T>(vec[current_idx + 2]) << 010) +
                          static_cast<T>(vec[current_idx + 3]));
}

(I have three similar functions for the case of 1, 2, and 8 bytes, respectively.)
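
For readers unfamiliar with the notation: the shift amounts 030, 020 and 010 in the function above are octal literals, i.e. 24, 16 and 8 bits. The sketch below spells this out and shows an equivalent 4-byte read with decimal shifts and bitwise OR. The name get_u32_be is hypothetical and not part of the original code, and the helper assumes the caller has already checked that four bytes are available.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A leading 0 makes an integer literal octal, so these are ordinary bit counts.
static_assert(030 == 24, "030 is octal for 24");
static_assert(020 == 16, "020 is octal for 16");
static_assert(010 == 8, "010 is octal for 8");

// Hypothetical helper, not part of the question: gives the same result as the
// 4-byte overload for uint32_t, written with decimal shifts and bitwise OR.
// Assumes current_idx + 4 <= vec.size().
inline std::uint32_t get_u32_be(const std::vector<std::uint8_t>& vec,
                                const std::size_t current_idx)
{
    return (static_cast<std::uint32_t>(vec[current_idx]) << 24) |
           (static_cast<std::uint32_t>(vec[current_idx + 1]) << 16) |
           (static_cast<std::uint32_t>(vec[current_idx + 2]) << 8) |
           static_cast<std::uint32_t>(vec[current_idx + 3]);
}
```

Because the shifted bytes never overlap, + and | give the same value here; | just makes the intent (assembling bit fields) a little more explicit.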

An example call would be

std::vector<uint8_t> vec {0x01, 0x00, 0x00, 0xff};
auto num = get_from_vector<uint32_t>(vec, 0);
assert(num == 0x10000FF);

Performance does not seem to be an issue here, but I nevertheless wonder whether the following code may be more efficient or at least more readable:

template<typename T, typename std::enable_if<sizeof(T) == 4, int>::type = 0>
static T get_from_vector(const std::vector<uint8_t>& vec, const size_t current_idx)
{
    union U
    {
        T result_type;
        uint8_t bytes[4];
    } u;
    u.bytes[3] = vec[current_idx];
    u.bytes[2] = vec[current_idx + 1];
    u.bytes[1] = vec[current_idx + 2];
    u.bytes[0] = vec[current_idx + 3];
    return u.result_type;
}

Any thoughts on this?
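
One point worth weighing when comparing the two versions: the union variant hard-codes a little-endian host, and reading a union member that was not the last one written is undefined behavior in standard C++. A possible alternative that stays within defined behavior is to copy the bytes with std::memcpy and reverse them only when the host is little-endian. This is a minimal sketch under those assumptions; host_is_little_endian and get_from_vector_memcpy are hypothetical names, and the caller is assumed to have checked the bounds.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: true when the host stores the least significant byte first.
inline bool host_is_little_endian()
{
    const std::uint16_t probe = 1;
    std::uint8_t first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Hypothetical name: reads sizeof(T) big-endian bytes starting at current_idx.
// Assumes current_idx + sizeof(T) <= vec.size().
template<typename T>
T get_from_vector_memcpy(const std::vector<std::uint8_t>& vec,
                         const std::size_t current_idx)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::uint8_t bytes[sizeof(T)];
    std::memcpy(bytes, vec.data() + current_idx, sizeof(T));
    if (host_is_little_endian())
        std::reverse(bytes, bytes + sizeof(T));  // input is big-endian
    T result;
    std::memcpy(&result, bytes, sizeof(T));
    return result;
}
```

With the example vector from the question, get_from_vector_memcpy<uint32_t>(vec, 0) yields 0x10000FF on either kind of host. Compilers commonly optimize small fixed-size memcpy calls away, but that is worth confirming by measurement rather than assuming.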


Solution

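  • Personally, I prefer your second version (using a union), because it seems to be a little faster and more readable. Keep in mind, though, that it only produces the right value on a little-endian host, and that reading a union member other than the one last written is undefined behavior in standard C++, although some compilers, such as GCC, support it as an extension.

    But there is another way to define your function: using pointers. The benefit is that you only need to define one function template instead of one overload per size.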

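    template<typename T>
    static T get_from_vector(const std::vector<uint8_t>& vec, const size_t current_index){
        T result;
        // Accessing the bytes of result through a uint8_t pointer is allowed.
        uint8_t *ptr = reinterpret_cast<uint8_t *>(&result);
        // Walk the input backwards so that the last (least significant) byte
        // lands at the lowest address; this assumes a little-endian host.
        size_t idx = current_index + sizeof(T);
        while(idx > current_index)
            *ptr++ = vec[--idx];
        return result;
    }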
    

    Adapting your example call:

    int main(){
        std::vector<uint8_t> vec {0x01, 0x00, 0x00, 0xff, 0x01, 0x00, 0x00, 0xff};
    
        auto byte1 = get_from_vector<uint8_t>(vec, 3);
        assert(byte1 == 0xff);
    
        auto byte2 = get_from_vector<uint16_t>(vec, 3);
        assert(byte2 == 0xff01);
    
        auto byte4 = get_from_vector<uint32_t>(vec, 4);
        assert(byte4 == 0x010000ff);
    
        auto byte8 = get_from_vector<uint64_t>(vec, 0);
        assert(byte8 == 0x010000ff010000ffUL);
    }
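
If the code also has to run on big-endian hosts, the single-template idea can be combined with the shift approach from the question. The following is a minimal sketch, assuming T is an integer type of at most 8 bytes and that the caller has verified current_idx + sizeof(T) <= vec.size(); the name get_from_vector_shift is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical name: one template for 1, 2, 4 and 8 bytes that does not
// depend on the byte order of the host.
template<typename T>
T get_from_vector_shift(const std::vector<std::uint8_t>& vec,
                        const std::size_t current_idx)
{
    static_assert(std::is_integral<T>::value, "T must be an integer type");
    static_assert(sizeof(T) <= 8, "at most 8 bytes are supported");
    std::uint64_t value = 0;
    // Most significant byte first: shift what we have and append the next byte.
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value = (value << 8) | vec[current_idx + i];
    return static_cast<T>(value);
}
```

Called with the same vector and indices as in the example above, this sketch returns the same four values. Accumulating in uint64_t keeps every shift unsigned, so no shift touches a sign bit even when T is a signed type.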