Tags: c++, c++11, stdstring

Are there downsides to using std::string as a buffer?


I have recently seen a colleague of mine using std::string as a buffer:

std::string receive_data(const Receiver& receiver) {
  std::string buff;
  int size = receiver.size();
  if (size > 0) {
    buff.resize(size);
    const char* dst_ptr = buff.data();
    const char* src_ptr = receiver.data();
    memcpy((char*) dst_ptr, src_ptr, size);
  }
  return buff;
}

I guess he wants to take advantage of the returned string's automatic destruction, so that he need not worry about freeing the allocated buffer.

This looks a bit strange to me since according to cplusplus.com the data() method returns a const char* pointing to a buffer internally managed by the string:

const char* data() const noexcept;

Memcpy-ing to a const char pointer? AFAIK this does no harm as long as we know what we are doing, but have I missed something? Is this dangerous?


Solution

  • Don't use std::string as a buffer.

    It is bad practice to use std::string as a buffer, for several reasons (listed in no particular order):

    • std::string was not intended for use as a buffer; you would need to double-check the description of the class to make sure there are no "gotchas" which would prevent certain usage patterns (or make them trigger undefined behavior).
    • As a concrete example: before C++17, you can't even write through the pointer you get from data(). It is a const CharT* (const char* for std::string), so your code would cause undefined behavior. (But &(str[0]), &(str.front()), or &(*(str.begin())) would work on a non-empty string, since C++11 guarantees contiguous storage.)
    • Using std::strings for buffers is confusing to readers of your function's definition, who assume you would be using std::string for, well, strings. In other words, doing so breaks the Principle of Least Astonishment.
    • Worse yet, it's confusing for whoever might use your function - they too may think what you're returning is a string, i.e. valid human-readable text.
    • As for alternatives: a std::unique_ptr<char[]> would be fine for your case, or even a std::vector<char>. In C++17, you can use std::byte as the element type, too. A more sophisticated option is a class with an SSO-like feature, e.g. Boost's small_vector (thank you, @gast128, for mentioning it).
    • (Minor point:) libstdc++ had to change its ABI for std::string to conform to the C++11 standard, so in some cases (which by now are rather unlikely), you might run into some linkage or runtime issues that you wouldn't with a different type for your buffer.
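
The std::vector alternative mentioned in the list could look roughly like the sketch below. The Receiver struct here is a hypothetical stand-in, since the question does not show the real class; only its data() and size() members are assumed. This is one possible shape under those assumptions, not a drop-in replacement.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the question's Receiver (the real class is not shown).
struct Receiver {
  const char* bytes;
  std::size_t length;
  const char* data() const { return bytes; }
  std::size_t size() const { return length; }
};

// Returns the received bytes in a vector, which releases its memory
// automatically, just like the std::string in the question.
std::vector<char> receive_data(const Receiver& receiver) {
  std::vector<char> buff(receiver.size());  // sized once, on construction
  if (!buff.empty()) {
    // On a non-const vector, data() returns a writable char*, so no cast is needed.
    std::memcpy(buff.data(), receiver.data(), buff.size());
  }
  return buff;
}
```

The return type now tells callers they are getting raw bytes rather than text, and the const-discarding cast from the question disappears.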

    Also, your code may make two heap allocations instead of one (implementation dependent): one when the string is constructed and another in resize(). But that in itself is not really a reason to avoid std::string, since you can avoid the double allocation using the construction in @Jarod42's answer.
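
If you do keep std::string, one way to avoid the separate resize() step is to construct the string directly from the source pointer and length. The sketch below is an illustration and not necessarily the construction used in the referenced answer; the Receiver struct is again a hypothetical stand-in for the class in the question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the question's Receiver (the real class is not shown).
struct Receiver {
  const char* bytes;
  std::size_t length;
  const char* data() const { return bytes; }
  std::size_t size() const { return length; }
};

std::string receive_data(const Receiver& receiver) {
  const std::size_t size = receiver.size();
  if (size == 0) {
    return std::string();  // nothing to copy; also avoids passing a null pointer
  }
  // The (pointer, count) constructor copies exactly `size` chars,
  // embedded '\0' bytes included, so no memcpy or cast is required.
  return std::string(receiver.data(), size);
}
```

This form never writes through data() at all, so the const question does not arise.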