Tags: c++, string, undefined-behavior, memcpy

Convert built-in data type to std::string: memcpy vs snprintf


I have referred to the relevant question and other posts before this. I am also aware that std::to_string() is the best way (but it's not available on a few platforms).

While experimenting, I came across a weird issue with memcpy(). For the sake of example, assume that we always pass built-in data types (int, char, long) to the function below:

template<typename T>
std::string to_string (const T& value)
{
  std::string s(16, 0); // Max size captured
  ::memcpy(&s[0], &value, sizeof(value));
  return s;
}

Running this function on its own in a sample program works fine. But when plugged into a bigger code base, it somehow gives weird results, i.e. spurious values. (Ubuntu 14.10, g++4.9 -std=c++11)

However, if I rewrite the above function using snprintf(), it works fine.

template<typename T>
std::string to_string (const T& value)
{
  std::string s(16, 0); // Max size captured
  s[::snprintf(&s[0], "%d", value)] = 0;
  return s;
}

Question:

  1. Am I touching undefined behavior with memcpy() (or even sprintf())?
  2. Would byte ordering influence this code?

Solution

  • To recap, yes, you do not want to use memcpy(): it copies the raw bytes of the value into the string instead of producing its decimal digits. With snprintf() you avoid having to convert the number to ASCII yourself. Something like this would probably be preferable, though:

    template<typename T>
    std::string to_string (const T& value)
    {
      char buf[16];
      ::snprintf(buf, sizeof(buf), "%d", value);
                   // ^-- size was missing in your example
      return buf;
    }
    

    However, you have a big flaw in this function because you cannot know what T is going to be. It could be a double, and "%d" won't work as expected (a format specifier that does not match the argument type is undefined behavior). Similarly, it could be a string (char const *).
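One way around the unknown T, sketched here under the assumption that only int, long and double need to be supported, is to replace the template with one overload per type so that each format specifier matches its argument. The name format_number is hypothetical and not part of the original code.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper name. One overload per built-in type, so the format
// specifier always matches the type of the argument passed to snprintf().
inline std::string format_number(int value)
{
    char buf[16];   // a 32-bit int needs at most 11 characters plus '\0'
    std::snprintf(buf, sizeof(buf), "%d", value);
    return buf;
}

inline std::string format_number(long value)
{
    char buf[24];   // a 64-bit long needs at most 20 characters plus '\0'
    std::snprintf(buf, sizeof(buf), "%ld", value);
    return buf;
}

inline std::string format_number(double value)
{
    char buf[32];   // %g output (6 significant digits) fits easily
    std::snprintf(buf, sizeof(buf), "%g", value);
    return buf;
}
```

Other types (unsigned, long long, float with a different precision, and so on) would need their own overloads; calling this sketch with an unsigned int, for instance, is ambiguous and fails to compile rather than silently printing garbage.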

    If you want to manually convert a number to ASCII you can use a loop, something like this:

    template<typename T>
    std::string to_string (T value)
    {
      char buf[24]; // enough for any integer of up to 64 bits (20 digits) plus '\0'
      char *s = buf + sizeof(buf);
      *--s = '\0';
      do
      {
        *--s = value % 10 + '0';  // conversion to ASCII, 1 digit at a time
        value /= 10;
      }
      while(value > 0);
      return s;
    }
    

    WARNING: that function does not properly handle negative numbers. I'll leave that one as an exercise for you to handle as required.

    Now, if you want a C++ way that should work on all the systems you mentioned, without boost or C++11, use std::stringstream (declared in <sstream>):
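For reference, one possible way to handle the sign is sketched below. It is not a template: it assumes the value fits in a long long (so C++11 or a compiler extension is needed), and the name int_to_string is hypothetical. The digits are produced from the unsigned magnitude so that the most negative value does not overflow when negated.

```cpp
#include <string>

// Hypothetical name; handles signed integers that convert to long long.
inline std::string int_to_string(long long value)
{
    char buf[24];   // 19 digits + sign + '\0' fit for a 64-bit long long
    char *s = buf + sizeof(buf);
    *--s = '\0';

    const bool negative = value < 0;
    // Use the unsigned magnitude: negating the most negative long long
    // directly would overflow, unsigned arithmetic wraps safely.
    unsigned long long magnitude = negative
        ? 0ULL - static_cast<unsigned long long>(value)
        : static_cast<unsigned long long>(value);

    do
    {
        *--s = static_cast<char>('0' + magnitude % 10);  // one digit at a time
        magnitude /= 10;
    }
    while (magnitude > 0);

    if (negative)
    {
        *--s = '-';
    }
    return s;       // builds the std::string from the first written character
}
```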

    template<typename T>
    std::string to_string (T const& value)
    {
      std::stringstream ss;
      ss << value;
      return ss.str();
    }
    

    In this case the stringstream knows how to handle T whatever T is (numbers, objects, etc.), as long as the type supports the << operator, as in std::cout << "Hello!" << std::endl;.

    If you check out one of my projects, named as2js, you'll see a file named include/as2js/node.h which declares something like this:
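As a small self-contained illustration of that point, the sketch below pairs the stream-based conversion with a user-defined type. The names stream_to_string and Point are hypothetical and chosen only for this example.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Same idea as the stringstream version above, under a hypothetical name.
template<typename T>
std::string stream_to_string(T const& value)
{
    std::ostringstream ss;
    ss << value;            // works for any T that has an operator <<
    return ss.str();
}

// Hypothetical user-defined type.
struct Point
{
    int x;
    int y;
};

// Teaching streams how to print a Point is enough to make
// stream_to_string() accept it.
inline std::ostream& operator << (std::ostream& out, Point const& p)
{
    return out << '(' << p.x << ", " << p.y << ')';
}
```

With these definitions, stream_to_string(42) yields "42" and a Point with x = 3 and y = 4 yields "(3, 4)", without any change to the conversion function itself.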

    std::ostream& operator << (std::ostream& out, Node const& node);
    

    That means you can later create a node and print it like this:

    Node n;
    std::cout << n << std::endl;
    

    This means your to_string() function would work with my Node objects.

    You can find the implementation of all of that under lib/node_display.cpp
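Coming back to the memcpy() version and the byte ordering question: the sketch below shows what that approach stores in the string, namely the object representation of the value rather than its digits. The helper names raw_bytes and hex_dump are hypothetical, and unlike the original the string is sized to sizeof(value) instead of 16.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: the bytes that the memcpy() approach puts in the string.
template<typename T>
std::string raw_bytes(const T& value)
{
    std::string s(sizeof(value), '\0');
    std::memcpy(&s[0], &value, sizeof(value));  // object representation, not text
    return s;
}

// Hypothetical helper: render each byte as two hexadecimal digits.
inline std::string hex_dump(const std::string& bytes)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::string::size_type i = 0; i < bytes.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        out += digits[c >> 4];
        out += digits[c & 0x0f];
    }
    return out;
}
```

Assuming a 4-byte int, hex_dump(raw_bytes(65)) gives "41000000" on a little-endian machine and "00000041" on a big-endian one, so byte ordering does change the result. On the little-endian machine the string happens to start with 'A' (0x41) followed by zero bytes, which is unrelated to the text "65".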