Tags: c++, struct, c++20, unions, stdformat

std::formatter c++ 20 with union and struct


Is there any reason why std::format doesn't compile using a data type like this one:

#include <cstdint>

typedef union MIDI_EVENT_type_u
{
    uint8_t val;
    struct {
        uint8_t low : 4;
        uint8_t high : 4;
    };
} MIDI_EVENT_type_u;

Using std::format to print the uint8_t value works only for val above; low and high do not compile unless they are cast to unsigned int.

Is there a reason for that, given that std::format is passed a uint8_t in each case?

e.g.

MIDI_EVENT_type_u type = {0};
auto a = std::format("{}", type.val);
auto b = std::format("{}", type.high); // error
auto c = std::format("{}", type.low);  // error

the error is:

no instance of overloaded function "std::format" matches the argument list
argument types are: (const char [3], uint8_t)

Is there a way to write a custom std::formatter to do the cast to unsigned automatically in this case?

Besides, it is not clear why the members of the struct inside the union do not compile even though they are reported as uint8_t. Is there a reason for that?


EXTENSION:

After the answer and comments from @user17732522 (thanks), there are these kinds of solutions for this particular case (at least to get it compiling; I am not sure they resolve the type punning completely):

Changing the struct, when that is possible, to something like:

typedef union MIDI_EVENT_type_u
{
    uint8_t val;
    // Better still: give the struct a name and make it private.
    struct {
        uint8_t low : 4;
        uint8_t high : 4;
    };
    // Return copies (prvalues) so the values can bind to reference parameters.
    constexpr uint8_t getHigh() const { return high; }
    constexpr uint8_t getLow() const { return low; }

} MIDI_EVENT_type_u;

Using the getHigh() and getLow() methods resolves the compilation error; a free constexpr function that returns the .high or .low value would work as well.

It looks to me as if something is missing in C++ to deal with this scenario, which has been a basic use case since C in the '70s...

Ergo: in this case C++ could auto-generate, or default to, something like those two extra constexpr functions; in my opinion there is no need for the code to be so verbose.


If instead it is a C struct that can't be modified:

typedef union MIDI_EVENT_type_u
{
    uint8_t val;
    struct {
        uint8_t low : 4;
        uint8_t high : 4;
    };
} MIDI_EVENT_type_u;
constexpr uint8_t get(const uint8_t val) { return val; }

This way std::format("{}", get(type.low)); also compiles. It looks excessively verbose for "basic operations" that always worked in C/C++ in older times; I don't really understand that.

Did this change in the modern C++ standard, and was it once supported?

If so, shouldn't the next C++ iteration consider supporting such basic cases and have the compiler generate this simple boilerplate underneath, since it can be resolved with a constexpr function that takes a type and returns the same type?


Solution

  • std::format takes its arguments by reference (specifically, as forwarding references).

    A bit-field can't be passed by reference. So you do need to first convert to a prvalue of some type from which then a temporary object can be materialized to which the reference parameter can be bound, e.g. by writing

    std::format("{}", uint8_t{type.high});
    

    or if you don't want to repeat the type:

    std::format("{}", decltype(type.high){type.high});
    

    With C++23 you can use the new auto syntax:

    std::format("{}", auto{type.high});
    

    You will have this problem with a huge part of the standard library when using bit-fields. The standard library typically takes generic arguments by forwarding reference.
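
The restriction is a language rule rather than anything specific to std::format, so it can be reproduced with any function template that takes a forwarding reference. A minimal sketch, assuming a hypothetical stand-in template take and a plain struct Nibbles (neither name is from the question):

```cpp
#include <cstdint>
#include <string>

// Hypothetical struct with the same two bit-fields as in the question.
struct Nibbles {
    std::uint8_t low : 4;
    std::uint8_t high : 4;
};

// Hypothetical stand-in with the same parameter shape as std::format's
// arguments: a forwarding reference.
template <class T>
std::string take(T&& value) {
    return std::to_string(static_cast<unsigned>(value));
}

inline std::string demo() {
    Nibbles n{};
    n.low = 3;
    n.high = 10;
    // take(n.high);  // ill-formed: a non-const reference cannot bind to a bit-field
    // Copying into a prvalue first gives the reference a real object to bind to.
    return take(std::uint8_t{n.high}) + "," + take(decltype(n.low){n.low});
}
```

Here demo() builds its result only from the copied values; uncommenting the direct call reproduces the kind of error shown in the question.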

    Bit-fields do not behave like normal class members or objects and are pretty much second-class to them. I'd suggest avoiding bit-fields or writing accessor member functions that return you the value so that you can use them directly. If this is a C API, then I'd suggest writing a proper C++ API around it first.


    Also, the syntax using an anonymous struct is not standard C++. A union can be anonymous, but a struct cannot, in standard C++. However, compilers are likely to support this in C++ as an extension, because it is valid syntax in C11.

    Also, accessing type.high and type.low causes undefined behavior according to the standard, because val is the active member of the union and only the active member may be read. Type punning through unions is not permitted in standard C++. Compilers typically support it as an extension, because it has defined behavior in C. Even then, the layout of bit-fields is implementation-defined: for example, the order in which bit-fields are packed into bytes is implementation-dependent, and so is the result of reading these inactive members.
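
If only the byte value is needed, the nibbles can also be computed from val with shifts and masks, which reads no inactive union member and does not depend on bit-field layout. A sketch with hypothetical helper names, assuming high means the upper four bits of the byte:

```cpp
#include <cstdint>

// Hypothetical helpers (names are not from the question). They operate on the
// byte value only, so no bit-field and no inactive union member is involved.
constexpr std::uint8_t low_nibble(std::uint8_t value) {
    return static_cast<std::uint8_t>(value & 0x0Fu);
}

// Assumption: "high" refers to the upper four bits of the byte.
constexpr std::uint8_t high_nibble(std::uint8_t value) {
    return static_cast<std::uint8_t>((value >> 4) & 0x0Fu);
}

// Compile-time checks of the bit arithmetic.
static_assert(low_nibble(0x9C) == 0x0C);
static_assert(high_nibble(0x9C) == 0x09);
```

With these, std::format("{}", high_nibble(type.val)) compiles, because the call returns a prvalue.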

    Also, assuming this is not a C API, typedef union MIDI_EVENT_type_u { /*...*/ } MIDI_EVENT_type_u; is completely redundant in C++. Just union MIDI_EVENT_type_u { /*...*/ }; is enough.