Tags: c, floating-point, byte, ieee-754

Correct way of reading bytes from IEEE754 floating point format


I need to read the four raw bytes of a single-precision IEEE 754 floating-point value so that I can send them over a serial port unmodified. Which of the following is the correct way to extract the bytes?

1.) creating a union such as:

typedef union {
  float f;
  uint8_t bytes[4];
  struct {
    uint32_t mantissa : 23;
    uint32_t exponent : 8;
    uint32_t sign : 1;
  };
} FloatingPointIEEE754_t ;

and then just reading the bytes[] array after writing to the float variable f?

2.) Or extracting the bytes with a function that points a uint32_t pointer at the float variable and then masks out the requested bits:

uint32_t extractBitsFloat(float numToExtFrom, uint8_t numOfBits, uint8_t bitPosStartLSB){
  uint32_t *p = &numToExtFrom;
  /* validate the inputs */
  if ((numOfBits > 32) || (bitPosStartLSB > 31)) return NULL;
  /* build the mask */
  uint32_t mask = ((1 << numOfBits) - 1) << bitPosStartLSB;
  return ((*p & mask) >> bitPosStartLSB);
}

which would be called like this:

valF = -4.235;
byte0 = extractBitsFloat(valF, 8, 0);
byte1 = extractBitsFloat(valF, 8, 8);
byte2 = extractBitsFloat(valF, 8, 16);
byte3 = extractBitsFloat(valF, 8, 24);

Please suggest the correct way if you think both of the above methods are wrong!


Solution

  • First of all, I assume you're coding specifically for a platform where float is actually represented as an IEEE 754 single-precision value. The C standard doesn't guarantee this, so your code won't be portable to all platforms.

    Given that, the union approach is the correct one. But don't add this bitfield member! The order in which bit-fields are allocated within a storage unit is implementation-defined, so you might access the wrong bits. Just do this:

    typedef union {
      float f;
      uint8_t bytes[4];
    } FloatingPointIEEE754;
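
As a usage sketch, the union can be wrapped in a small helper that copies the bytes out in memory order. The helper name float_raw_bytes is made up for illustration, and the code assumes float is four bytes wide on the target:

```c
#include <stddef.h>
#include <stdint.h>

typedef union {
    float f;
    uint8_t bytes[4];
} FloatingPointIEEE754;

/* Hypothetical helper: copies the raw bytes of value into buf,
   in the order they are stored in memory (host byte order).
   Returns the number of bytes written. */
size_t float_raw_bytes(float value, uint8_t buf[4])
{
    FloatingPointIEEE754 u;
    size_t i;

    u.f = value;                      /* write the float member ...      */
    for (i = 0; i < sizeof u.bytes; ++i)
        buf[i] = u.bytes[i];          /* ... then read the bytes member  */
    return sizeof u.bytes;
}
```

The filled buffer can then be handed to whatever routine writes to the serial port.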
    

    Also, don't add a _t suffix to your own types. On POSIX systems, this is reserved to the implementation, so it's best to always avoid it.

    Instead of using a union, accessing the bytes through a char pointer is fine as well:

    unsigned char *rep = (unsigned char *)&f;
    // access rep[0] to rep[3]
    

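    Note that in both cases you are accessing the representation as it is stored in memory, so you have to pay attention to the endianness of your machine: the same float yields its bytes in opposite order on little-endian and big-endian hosts.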
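
If the receiver expects a fixed byte order, one option is to copy the float into a uint32_t and shift the bytes out, which gives the same result on any host. This is only a sketch: the name float_to_be_bytes is hypothetical, big-endian (most significant byte first) is an arbitrary choice for the wire format, and the code assumes float is a 32-bit IEEE 754 single that uses the same byte order as uint32_t on the platform.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical helper: stores the bit pattern of f in out[0..3],
   most significant byte first, regardless of host byte order. */
void float_to_be_bytes(float f, uint8_t out[4])
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);   /* copy the representation */
    out[0] = (uint8_t)(bits >> 24);   /* sign + upper exponent bits */
    out[1] = (uint8_t)(bits >> 16);
    out[2] = (uint8_t)(bits >> 8);
    out[3] = (uint8_t)bits;           /* lowest mantissa bits */
}
```

Because the shifts operate on the integer value rather than on memory, the byte order of the output does not depend on how the host stores the integer.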


    Your second option isn't correct because it violates the strict aliasing rule. In short, you're not allowed to access an object through a pointer whose type isn't compatible with the object's type, so reading a float through a uint32_t pointer is undefined behavior. Character types are an explicit exception, which is why a char pointer may be used to access the representation. The exact rules are in 6.5 p7 of N1570, the final public draft of the C11 standard.
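
If the masking interface from the question is still wanted, one way to keep it within the aliasing rules is to copy the float into a uint32_t with memcpy before masking. The following is a sketch under the same assumption of a 32-bit IEEE 754 float. The name extract_bits_float and the choice to return 0 for invalid arguments are illustrative, not taken from the question.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical rework of the question's function: returns num_bits bits of
   the float's bit pattern, starting at bit start_bit (0 = least significant).
   Returns 0 for an invalid request, which a caller cannot tell apart from a
   field that really is 0. */
uint32_t extract_bits_float(float value, unsigned num_bits, unsigned start_bit)
{
    uint32_t bits;
    uint32_t mask;

    /* Reject empty fields and fields that would run past bit 31. */
    if (num_bits == 0 || num_bits > 32 || start_bit > 31 ||
        num_bits > 32 - start_bit)
        return 0;

    memcpy(&bits, &value, sizeof bits);   /* no pointer cast needed */

    /* Shifting a 32-bit value by 32 is undefined, so the full-width
       mask is handled separately. */
    mask = (num_bits == 32) ? UINT32_MAX : ((UINT32_C(1) << num_bits) - 1u);
    return (bits >> start_bit) & mask;
}
```

Since the bits are extracted from the integer value, bit 0 is always the least significant mantissa bit here, independent of host endianness.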