c++ hex iomanip

std::hex cannot process negative numbers?


I'm trying to use std::hex to read hexadecimal integers from a file.

0
a
80000000
...

These integers are both positive and negative.

It seems that std::hex cannot handle negative numbers. I don't understand why, and I don't see a range defined in the docs.

Here is a test bench:

#include <iostream>
#include <sstream>
#include <iomanip>

int main () {

  int i;
  std::stringstream ss;

  // This is the smallest number
  // That can be stored in 32 bits -1*2^(31)
  ss << "80000000";

  ss >> std::hex >> i;

  std::cout << std::hex << i << std::endl;

}

Output:

7fffffff

Solution

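  • Setting std::hex tells the stream to parse integer tokens as hexadecimal, as though using std::scanf with the %X conversion. The token 80000000 is therefore parsed as the positive value 0x80000000 (2147483648), which is out of range for a 32-bit int even though the bit pattern fits. Because of the overflow, the read fails, and i does not hold the expected value. Side note: since C++11, a read that fails on overflow stores the largest representable value (7fffffff for a 32-bit int, which matches the output in the question), while a read that fails because nothing could be parsed stores 0. Before C++11, a failed read leaves i unchanged, which here means its unspecified, uninitialized value.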

    Note that if we check the stream state after the read, something you should ALWAYS do, we can see that the read failed:

    #include <iostream>
    #include <sstream>
    #include <iomanip>
    #include <cstdint> // added for fixed width integers.
    int main () {
    
      int32_t i; //ensure 32 bit int
      std::stringstream ss;
    
      // This is the smallest number
      // That can be stored in 32 bits -1*2^(31)
      ss << "80000000";
    
      if (ss >> std::hex >> i)
      {
          std::cout << std::hex << i << std::endl;
      }
      else
      {
          std::cout << "FAIL! " <<  std::endl; //will execute this
      }
    }
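
To make the failure mode above easier to inspect, here is a minimal sketch that reports both the stream state and the value left in the target after a direct read into a signed 32-bit integer. The names HexReadResult and read_hex_signed are hypothetical, chosen only for this illustration, and the behaviour described in the comments assumes a C++11 or later standard library.

```cpp
#include <cstdint>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper type: what happened when a hex token was
// extracted directly into a signed 32-bit integer.
struct HexReadResult {
    bool ok;             // true if the extraction succeeded
    std::int32_t value;  // whatever the stream left in the target
};

// Hypothetical helper: parse `text` as hexadecimal into std::int32_t.
HexReadResult read_hex_signed(const std::string& text) {
    std::istringstream in(text);
    std::int32_t value = -1;  // sentinel, so any write by the stream is visible
    const bool ok = static_cast<bool>(in >> std::hex >> value);
    return {ok, value};
}
```

With a conforming C++11 library, read_hex_signed("7fffffff") should succeed, read_hex_signed("80000000") should fail and leave INT32_MAX in value, and read_hex_signed("zz") should fail and leave 0.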
    

    The solution is, as the asker surmised in the comments, to read into an unsigned int (uint32_t, to avoid further surprises if int is not 32 bits). The following is the zero-surprises version of the code, using memcpy to transfer the exact bit pattern that was read into i.

    #include <iostream>
    #include <sstream>
    #include <iomanip>
    #include <cstdint> // added for fixed width integers.
    #include <cstring> //for memcpy
    int main () {
    
      int32_t i; //ensure 32 bit int
      std::stringstream ss;
    
      // This is the smallest number
      // That can be stored in 32 bits -1*2^(31)
      ss << "80000000";
    
      uint32_t temp;
      if (ss >> std::hex >> temp)
      {
          std::memcpy(&i, &temp, sizeof(i)); // copies the bit pattern; probably compiles down to a cast
          std::cout << std::hex << i << std::endl;
      }
      else
      {
        std::cout << "FAIL! " <<  std::endl; 
      }
    }
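
Since the question is about reading a whole file of such values, the same unsigned-then-memcpy idea can be wrapped in a small function and called in a loop. This is only a sketch under the assumptions already made above (every token is a 32-bit pattern written in hex); the name read_hex_as_int32 is hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read one hex token as a 32-bit pattern and
// store that exact pattern in a signed 32-bit integer.
// Returns false, leaving `out` untouched, if the extraction fails
// (for example at end of input or if the token needs more than 32 bits).
bool read_hex_as_int32(std::istream& in, std::int32_t& out) {
    std::uint32_t bits = 0;
    if (!(in >> std::hex >> bits)) {
        return false;
    }
    std::memcpy(&out, &bits, sizeof out);  // bit-for-bit copy, no aliasing issues
    return true;
}
```

A caller would typically write something like `while (read_hex_as_int32(file, value)) { ... }` and check afterwards whether the loop stopped at end of file or on a malformed token.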
    

    That said, diving into old-school C-style coding for a moment:

      if (ss >> std::hex >> *reinterpret_cast<uint32_t*>(&i))
      {
        std::cout << std::hex << i << std::endl;
      }
      else
      {
        std::cout << "FAIL! " <<  std::endl; 
      }
    

    This looks like a strict aliasing violation, but the aliasing rule explicitly allows an object to be accessed through the signed or unsigned type corresponding to its own type. As long as int32_t and uint32_t are such a pair (they are on mainstream implementations, where they are typedefs for int and unsigned int), reading into i this way is permitted, and I'd be stunned to see it fail once 32 bit int is forced with int32_t i;.