Tags: c++, string, stl, constants, implicit-constructor

Is implicit construction of `const std::string` from `const char *` efficient?


Like many people I'm in the habit of writing new string functions as functions of const std::string &. The advantages are efficiency (you can pass existing std::string objects without incurring overhead for copying/moving) and flexibility/readability (if all you have is a const char * you can just pass it and have the construction done implicitly, without cluttering up your code with an explicit std::string construction):

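#include <cstddef>
#include <iostream>
#include <string>

// Accepts any std::string by const reference; a const char * argument is
// converted implicitly by constructing a temporary std::string.
std::size_t LengthOfStringlikeObject(const std::string & s)
{
    return s.length();  // length() returns a size_t, so return it without narrowing
}

int main(int argc, const char * argv[])
{
    // argv[0] is a const char *, so the implicit construction happens at this call.
    std::size_t n = LengthOfStringlikeObject(argv[0]);
    std::cout << "'" << argv[0] << "' has " << n << " characters\n";
}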

My aim is to write cross-platform code that handles long strings efficiently. My question is, what happens during the implicit construction? Are there any guarantees that the string will not be copied? It strikes me that, because everything is const, copying is not necessary (a thin STL wrapper around the existing pointer is all that's needed), but I'm not sure how compiler- and platform-dependent I should expect that behavior to be. Would it be safer to always explicitly write two versions of the function, one for const std::string & and one for const char *?


Solution

  • It strikes me that, because everything is const, copying is not necessary—a thin STL wrapper around the existing pointer is all that's needed

    I don't think this assumption is correct. Having a pointer to const does not imply that the underlying value cannot change; it only implies that the value cannot be changed through that pointer. The pointer could be pointing to non-const storage, which can change at any time.

    Because of this, the library must make its own copy so that the string keeps its expected observable behavior. A quick review of libstdc++ shows that it always makes a copy. The constructor taking a const char * is not inline there, so the copy cannot be optimized away without static linking and LTO (link-time optimization).

    Extremely trivial statically linked programs might have the copy optimized away with LTO (I wasn't able to reproduce this), but in general I think this optimization is unlikely to be performed, especially considering the aliasing rules for char*. g++ doesn't even perform it for a string literal.