A byte type: std::byte vs std::uint8_t vs unsigned char vs char vs std::bitset<8>

C++ has a lot of types that vaguely describe the same thing. Assuming that we are compiling for an architecture where a byte is 8-bit, all of the following types are vaguely similar:

std::byte
std::uint8_t
std::bitset<8>
unsigned char (8-bit)
char (8-bit)

If a byte is 8-bit, are all these types more or less interchangeable? If not, when would one need to be used instead of another?

I often see questions like Converting a hex string to a byte array on Stack Overflow where someone uses std::uint8_t, char, unsigned char and other types to represent a "byte". Is this just a matter of stylistic preference?

Note: This Q&A is intended to be a community FAQ, and edits are encouraged. The question of when to use which type for a "byte", and why, comes up all the time, despite C++17 having introduced std::byte, which seemingly makes the choice obvious. Having an FAQ that addresses the misconceptions about std::bitset, std::uint8_t, etc. being a "byte" is useful.

Solution

On architectures where a byte is 8 bits wide, all the listed types are vaguely similar in the sense that they model something that has 8 bits. However, the use cases are fundamentally different, and only some of these types are guaranteed the special properties that make them usable as a byte type.

Overview

Type	Definition	Purpose
`std::byte`	`enum class byte : unsigned char {};`	the canonical byte type ✔️ all special properties
`unsigned char`	fundamental type	character / legacy byte type / small arithmetic type ✔️ all special properties
`signed char`	fundamental type	character / small arithmetic type ❌ no special properties
`char`	fundamental type, same underlying type as `signed char` or `unsigned char`	a character ⚠️ only some special properties
`char8_t`	fundamental type with underlying type `unsigned char`	UTF-8 character ❌ no special properties
`std::uint8_t`	`typedef unsigned char uint8_t;` (This is not guaranteed, just the most common implementation.)	8-bit unsigned arithmetic ⚠️ special properties not guaranteed
`std::bitset<8>`	`template <std::size_t N>` `class bitset;`	set of 8 bits; might be wider than 8 bits ❌ no special properties

See the appendix at the end of this answer for a list of all these special properties, type by type.

`std::byte` (C++17)

This is the canonical byte type in C++. Whenever you have to ask yourself the question "Which type should I use to represent these bytes?", std::byte is the answer.

Note that std::byte is very special because there are many relaxations that allow you to use the type in otherwise undefined ways. For example, the strict aliasing rule is relaxed for std::byte ([basic.lval] p11), meaning that you can examine any object as an array of std::bytes.

Most other types don't have these special powers, and attempting to use them as a byte would be undefined behavior.
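As a minimal sketch of what the relaxed aliasing rule permits, the helper below reads the object representation of an arbitrary object through a std::byte pointer. The name object_bytes is hypothetical and not part of any library, and stepping through the bytes with pointer arithmetic is the widely used idiom rather than something this answer spells out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns a copy of the bytes that make up `value`.
// Accessing any object through a glvalue of type std::byte is allowed by
// the aliasing rule; the same cast to, say, const int* would not be.
template <typename T>
std::vector<std::byte> object_bytes(const T& value) {
    const auto* first = reinterpret_cast<const std::byte*>(&value);
    return std::vector<std::byte>(first, first + sizeof(T));
}
```

For example, object_bytes(std::uint32_t{0}) yields sizeof(std::uint32_t) bytes, each equal to std::byte{0}.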

As appropriate as std::byte is for raw memory operations, many older APIs such as the <iostream> library predate it and aren't designed around it. The type is also somewhat clunky (e.g. my_byte == 0 is not possible). Don't attempt to forcefully use it with libraries that weren't designed for std::byte.
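The clunkiness mentioned above comes from std::byte being a scoped enumeration: it has no implicit conversions to or from integers. The following sketch shows the workarounds; is_zero and low_nibble are hypothetical names chosen for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: tests whether a byte has the value zero.
constexpr bool is_zero(std::byte b) {
    // return b == 0;          // ill-formed: std::byte does not compare with int
    return b == std::byte{0};  // OK: compare against another std::byte
}

// Hypothetical helper: bitwise operators exist for std::byte, but turning
// the result into a number requires an explicit std::to_integer.
constexpr int low_nibble(std::byte b) {
    return std::to_integer<int>(b & std::byte{0x0F});
}

static_assert(is_zero(std::byte{0}));
static_assert(low_nibble(std::byte{0xAB}) == 0x0B);
```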

`unsigned char`

This is the closest thing to a "byte" there is prior to C++17. unsigned char has all the special properties that a std::byte has.

However, the name is very confusing, and the type is also treated as a character in some contexts. For example, writing an unsigned char to a std::ostream with operator<< prints it as a character instead of printing its numeric value. Also, arithmetic on unsigned char promotes the operands to int before any operation, which seems inappropriate for a "byte".
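Both surprises can be observed with a few lines. This is a sketch that assumes an ASCII-compatible execution character set and a platform where int can represent every unsigned char value; streamed is a hypothetical helper name.

```cpp
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical helper: formats `value` with operator<< and returns the text.
template <typename T>
std::string streamed(T value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Arithmetic on unsigned char is carried out in int (integer promotion).
// Assumption: int can hold all unsigned char values, as on typical platforms.
constexpr unsigned char one = 1;
static_assert(std::is_same_v<decltype(one + one), int>);
```

With these definitions, streamed(static_cast<unsigned char>(65)) produces "A", whereas applying unary plus first, as in streamed(+static_cast<unsigned char>(65)), promotes the value to int and produces "65".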

All in all, it's a wishy-washy type that is simultaneously a byte, a character, and an arithmetic type. Prefer std::byte for bytes, char for characters, and std::uint8_t or std::uint_least8_t for small unsigned arithmetic instead.

`signed char`

The signed counterpart to unsigned char is similarly confused. It has almost none of the special properties that std::byte and unsigned char have, and is a strange mix of arithmetic and character type. It should also be avoided.

A better alternative is std::int_least8_t which is also signed, and also guaranteed to be at least 8 bits wide, but which doesn't have a weird connotation of also being a character.

`char`

This is a distinct type which has the same underlying type as signed char or unsigned char. It has most (but not all) of the special properties of unsigned char and std::byte. For example, unlike unsigned char, it does not provide storage ([intro.object] p3) for objects created in a char[].

char should be used for what the name says: a character.

`char8_t` (C++20)

There was originally some discussion about this type having special properties akin to char, but it ended up having none. Its underlying type is unsigned char, but unlike with std::byte, this doesn't mean that it inherits any properties from it.

It should be used as a UTF-8 character, possibly within a UTF-8 encoded string.

`std::uint8_t` (C++11)

This type is a design mistake that started in C. While this isn't guaranteed, it is usually implemented as a type alias like

typedef unsigned char uint8_t;

This means that it has the special properties of unsigned char in practice (since mainstream compilers implement it like this), but none of this is guaranteed by the standard. The fact that it can alias every other type can also be detrimental to performance: the compiler has to assume that a write through a std::uint8_t pointer may modify any object, which would not be the case if it were an alias for a distinct integer type.

One thing to note is that a byte isn't guaranteed to be 8 bits in C++. Many people use std::uint8_t because it offers a perceived safety of really being 8 bits. However, std::uint8_t is optional and doesn't exist on platforms where a byte is wider than 8 bits, so it is no more portable than:

#include <climits>
static_assert(CHAR_BIT == 8); // ... and use unsigned char or char as a byte type

For a more portable 8-bit arithmetic type, there are std::uint_least8_t and std::uint_fast8_t, which are guaranteed to exist but may be wider than 8 bits.

Note that std::uint8_t, std::uint_least8_t, and std::uint_fast8_t may all be promoted to int, just like unsigned char.
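Because std::uint_least8_t may be wider than 8 bits and its operands are promoted before arithmetic, code that relies on 8-bit wraparound has to mask explicitly. A minimal sketch, with add_mod_256 as a hypothetical name:

```cpp
#include <cstdint>

// Hypothetical helper: adds two values modulo 256, no matter how wide
// std::uint_least8_t actually is on the target platform.
constexpr std::uint_least8_t add_mod_256(std::uint_least8_t a,
                                         std::uint_least8_t b) {
    // a + b is evaluated in the promoted type (int on typical platforms),
    // so the sum is masked down to the low 8 bits before converting back.
    return static_cast<std::uint_least8_t>((a + b) & 0xFF);
}

static_assert(add_mod_256(200, 100) == 44);  // 300 mod 256
static_assert(add_mod_256(255, 1) == 0);     // wraps around to zero
```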

`std::bitset<8>`

This is the furthest from a "byte" type. It models a sequence of bits, or a set of numbers, depending on perspective.

A std::bitset<8> is at least as large as int in most implementations, so it isn't even 8 bits large. Only use this type for what the name says: a set of bits. It is not a byte.
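The size claim is easy to check on a given implementation. The sketch below only asserts what the standard implies (room for at least 8 bits); the actual size is implementation-defined, so print or inspect bitset8_storage_bits to see the value for your toolchain. Both bitset8_storage_bits and count_set are hypothetical names.

```cpp
#include <bitset>
#include <climits>
#include <cstddef>

// Number of bits of storage one std::bitset<8> object occupies here.
// Implementations typically use at least one machine word.
constexpr std::size_t bitset8_storage_bits = sizeof(std::bitset<8>) * CHAR_BIT;
static_assert(bitset8_storage_bits >= 8);

// Hypothetical helper showing the intended use: treating the low 8 bits
// of `mask` as a set and counting how many of them are set.
inline std::size_t count_set(unsigned long mask) {
    return std::bitset<8>(mask).count();
}
```

For instance, count_set(0xB1) returns 4, and bits above the eighth are simply discarded on construction.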

Conclusion

std::byte is the only type which models a byte, nothing more, nothing less. It should be preferred as a byte type whenever possible. All other types are either missing crucial properties or have a fundamentally different purpose than being a byte.

Appendix

Special properties of `std::byte` and ordinary character types

Section	Affected Types	Special Properties
[intro.object] p3	`unsigned char[]`, `std::byte[]`	array provides storage for objects placed inside
[intro.object] p13	`unsigned char[]`, `std::byte[]`	array implicitly creates objects inside when its lifetime begins
[basic.life] p6.4	cv `char*`, cv `unsigned char*`, and cv `std::byte*`	`static_cast` of pointers to objects outside lifetime is allowed
[basic.indet]	unsigned ordinary character types, `std::byte`	indeterminate results allowed when initializing and assigning
[basic.types.general] p2	`char[]`, `unsigned char[]`, `std::byte[]`	trivially copyable objects can have their value transferred via an array
[basic.lval] p11.3	`char`, `unsigned char`, `std::byte`	relaxed strict aliasing
[expr.new] p16	`char[]`, `unsigned char[]`, `std::byte[]`	stricter alignment in a new-expression
[bit.cast] p2	unsigned ordinary character types, `std::byte`	indeterminate results allowed for `std::bit_cast`

Note: it's unclear what unsigned ordinary character type actually means. See Editorial Issue 5070.
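To illustrate the first row of the table, here is a sketch of a std::byte array providing storage for an object created with placement new. Point and sum_in_byte_storage are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <new>

// Hypothetical aggregate used only for illustration.
struct Point {
    int x;
    int y;
};

// Creates a Point inside a suitably aligned std::byte array and reads it back.
// The same pattern is allowed with unsigned char[], but the table above does
// not list char[] as providing storage.
inline int sum_in_byte_storage(int x, int y) {
    alignas(Point) std::byte storage[sizeof(Point)];
    Point* p = ::new (static_cast<void*>(storage)) Point{x, y};
    const int result = p->x + p->y;
    p->~Point();  // trivial here, shown for completeness
    return result;
}
```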
