
How to work with uint8_t instead of char?


I would like to understand the situation regarding uint8_t vs. char: portability, bit manipulation, best practices, the current state of affairs, and so on. Can you recommend good reading on the topic?

I want to do byte I/O. But char has a more complicated and subtle definition than uint8_t, which I assume was one of the reasons for introducing the stdint header.

However, I have had problems using uint8_t on several occasions. A few months ago, for example, I ran into the fact that iostreams are not defined for uint8_t. Is there no C++ library that does really well-defined byte I/O, i.e. reads and writes uint8_t? If not, I assume there is no demand for it. Why?

My latest headache stems from the failure of this code to compile:

uint8_t read(decltype(cin) & s)
{
    char c;
    s.get(c);
    return reinterpret_cast<uint8_t>(c);
}

error: invalid cast from type 'char' to type 'uint8_t {aka unsigned char}'

Why the error? How to make this work?


Solution

  • The general, portable, roundtrip-correct way would be to:

    1. demand in your API that all byte values can be expressed with at most 8 bits,
    2. use the layout-compatibility of char, signed char and unsigned char for I/O, and
    3. convert unsigned char to uint8_t as needed.

    For example:

    bool read_one_byte(std::istream & is, uint8_t * out)
    {
        unsigned char x;    // a "byte" on your system 
        if (is.get(reinterpret_cast<char &>(x)))    // get(char &) extracts one character
        {
            *out = x;
            return true;
        }
        return false;
    }
    
    bool write_one_byte(std::ostream & os, uint8_t val)
    {
        unsigned char x = val;
        // the stream's operator bool is explicit since C++11, so convert explicitly
        return static_cast<bool>(os.write(reinterpret_cast<char const *>(&x), 1));
    }
    

    Some explanation: Rule 1 guarantees that values can be round-trip converted between uint8_t and unsigned char without losing information. Rule 2 means that we can use the iostream I/O operations on unsigned char variables, even though they're expressed in terms of chars.
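
The round-trip claim can be checked with a small sketch. It assumes an in-memory std::stringstream stands in for the real stream, and the helper name round_trips_all_bytes is hypothetical; the two I/O functions follow the pattern described above.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Rule 2: the unsigned char is viewed as a char for the stream call.
bool read_one_byte(std::istream & is, std::uint8_t * out)
{
    unsigned char x;
    if (is.get(reinterpret_cast<char &>(x)))
    {
        *out = x;   // Rule 3: value conversion unsigned char -> uint8_t
        return true;
    }
    return false;
}

bool write_one_byte(std::ostream & os, std::uint8_t val)
{
    unsigned char x = val;
    return static_cast<bool>(os.write(reinterpret_cast<char const *>(&x), 1));
}

// Hypothetical helper: writes every value 0..255, reads them back in order,
// and reports whether each one came back unchanged.
bool round_trips_all_bytes()
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    for (int v = 0; v < 256; ++v)
        if (!write_one_byte(ss, static_cast<std::uint8_t>(v)))
            return false;
    for (int v = 0; v < 256; ++v)
    {
        std::uint8_t got = 0;
        if (!read_one_byte(ss, &got) || got != v)
            return false;
    }
    return true;
}
```

Values above 127 are the interesting case: they would be negative when held in a signed char, but they survive because the stream only ever sees the object representation of the unsigned char.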

    We could also have used is.read(reinterpret_cast<char *>(&x), 1) instead of is.get() for symmetry. (In general, when read is called with a count larger than 1 and fails, you also need gcount() to find out how many characters were actually extracted, but that doesn't apply here.)

    As always, you must never ignore the return value of I/O operations. Doing so is always a bug in your program.