Search code examples
c++stringperformanceconstruction

Understanding the efficiency of an std::string


I'm trying to learn a little bit more about c++ strings.

consider

const char* cstring = "hello";
std::string string(cstring);

and

std::string string("hello");

Am I correct in assuming that both store "hello" in the .data section of an application and the bytes are then copied to another area on the heap where the pointer managed by the std::string can access them?

How could I efficiently store a really really long string? I'm kind of thinking about an application that reads in data from a socket stream. I fear concatenating many times. I could imagine using a linked list and traverse this list.

Strings have intimidated me for far too long!

Any links, tips, explanations, further details, would be extremely helpful.


Solution

  • I have stored strings in the 10's or 100's of MB range without issue. Naturally, it will be primarily limited by your available (contiguous) memory / address space.

    If you are going to be appending / concatenating, there are a few things that may help efficiency-wise: If possible, try to use the reserve() member function to pre-allocate space-- even if you have a rough idea of how big the final size might be, it would save from unnecessary re-allocations as the string grows.

    Additionally, many string implementations use "exponential growth", meaning that they grow by some percentage, rather than fixed byte size. For example, it might simply double the capacity any time additional space is needed. By increasing size exponentially, it becomes more efficient to perform lots of concatenations. (The exact details will depend on your version of stl.)

    Finally, another option (if your library supports it) is to use rope<> template: Ropes are similar to strings, except that they are much more efficient when performing operations on very large strings. In particular, "ropes are allocated in small chunks, significantly reducing memory fragmentation problems introduced by large blocks". Some additional details on SGI's STL guide.