c++ memcpy reinterpret-cast type-punning

How to do type punning correctly in C++


Let's say I have this code:

//Version 1
#include <iostream>
#include <cstdint>

int main()
{
    uint32_t bits{0x3dfcb924}; //bits describe "0.1234" as IEEE 754 floating point
    float num {*((float*) &bits)};
    std::cout << num << std::endl;
}

All I want is to interpret the bits from the bits variable as a float. I came to understand that this is called "type punning".

The above code currently works on my machine with GCC 10 on Linux.

I have used this method to "reinterpret bits" for quite some time. However, recently I learned about the "strict aliasing rule" from this post:

What is the strict aliasing rule?

What I took away from it: accessing the same object through pointers of two different types (for example uint32_t* and float*) is undefined behaviour. So... is my code example above undefined behaviour?

I searched for a way to do it "correctly" and came across this post:

What is the modern, correct way to do type punning in C++?

The accepted answer just says "just use std::memcpy" and, if the compiler supports it (mine doesn't), use "std::bit_cast".

I have also searched some other forums and read through some lengthy discussions (most of which were above my level of knowledge) but most of them agreed: Just use std::memcpy.

So... do I do it like this instead?

//Version 2
#include <iostream>
#include <cstdint>
#include <cstring>

int main()
{
    uint32_t bits{0x3dfcb924}; 
    float num {};
    std::memcpy(&num, &bits, sizeof(bits));
    std::cout << num << std::endl;
}

Here, &num and &bits are implicitly converted to void pointers, right? Is that ok?

Still... is version 1 REALLY undefined behaviour? I seem to recall that some source (which I unfortunately can't link here because I can't find it again) said that the strict aliasing rule only applies when you try to convert to a class type and that reinterpreting between fundamental types is fine. Is this true or total nonsense?

Also... in version 1 I use C-style casting to convert a uint32_t* to a float*. I recently learned that C-style casting will just attempt the various types of C++ casts in a certain order (https://en.cppreference.com/w/cpp/language/explicit_cast). Also, I heard I should generally avoid C-style casts for that reason.

So IF version 1 was fine, would it be better to just do it like this instead?

//Version 3
#include <iostream>
#include <cstdint>

int main()
{
    uint32_t bits{0x3dfcb924};
    float num {*reinterpret_cast<float*>(&bits)};
    std::cout << num << std::endl;
}

From my understanding, reinterpret_cast is used to convert a pointer to type A into a pointer to type B, "reinterpreting" the underlying bits in the process, which is exactly what I want to do. I believed that version 1 did exactly this anyway, since the C-style cast will detect that and automatically turn it into a reinterpret_cast. If that is the case, Version 1 and Version 3 would be identical since they both do reinterpret_casts, only that Version 3 does so explicitly. Is that correct?

So... which one should I use? Version 1, Version 2 or Version 3? And why?

All three versions seem to work on my machine by the way.

EDIT: Forgot to mention... if Version 3 WAS undefined behaviour, what is the point of reinterpret_cast then anyway? I looked at this post:

When to use reinterpret_cast?

But I didn't really find an answer that I understood. So... what is reinterpret_cast good for then?


Solution

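  • None of them, if you can avoid it. Use std::bit_cast instead (C++20, header <bit>). Versions 1 and 3 are undefined behaviour, and UB is UB: you can't trust that it will work "next time". Version 2 (std::memcpy) is well defined, so it is the fallback when std::bit_cast is not available.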

    #include <iostream>
    #include <cstdint>
    #include <bit>
    
    int main() {
        uint32_t bits{0x3dfcb924}; //bits describe "0.1234" as IEEE 754 floating point
        float num = std::bit_cast<float>(bits);
        std::cout << num << std::endl;
    }