
Inspecting the return value of std::string::c_str()


In C++ Primer 5th Edition, it says:

The Array returned by c_str is not guaranteed to be valid indefinitely.

So I did a test:

//  c_str exploration
std::string strTest = "This is a test";
const char* s1 = strTest.c_str();
strTest = "This is b test";
std::cout << s1 << std::endl;

Since s1 is a pointer, it shows the new value. However, when I change the value to a string of a different length, it usually shows some garbage:

//  c_str exploration
std::string strTest = "This is a test";
const char* s1 = strTest.c_str();
strTest = "This is b testsssssssssssssssssssssssssss";
std::cout << s1 << std::endl;

I figured this is because the returned C string has already fixed the position of the terminating null character, so when the length changes it invalidates everything. To my surprise, sometimes it is still valid even after I change the string to a new length:

//  c_str exploration
std::string strTest = "This is a test";
const char* s1 = strTest.c_str();
strTest = "This is b tests";     // Note the extra s at the end
std::cout << s1 << std::endl;

Second question:

I'm also not sure why std::cout << s1 prints the content instead of the address of the C string, while the following code prints the address of the integer, as I expected:

int dim = 42;
int* pdim = &dim;
std::cout << pdim << std::endl;

This prints out the character 'T', as expected:

std::cout << *s1 << std::endl;

My assumption is that std::cout does an auto convert, but please lecture me more on this.


Solution

  • First Question

    The pointer returned by std::string::c_str() remains valid as long as the string is not modified. From cppreference.com:

    The pointer obtained from c_str() may be invalidated by:

    • Passing a non-const reference to the string to any standard library function, or
    • Calling non-const member functions on the string, excluding operator[], at(), front(), back(), begin(), rbegin(), end() and rend().
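    To illustrate the exemption list, here is a minimal sketch, assuming C++11 or later. The helper name edit_in_place is made up for this example. It changes one character through operator[], which is one of the exempt functions, so the pointer obtained earlier stays valid and sees the change:

```cpp
#include <string>

// Illustrative helper (not from the original post): modifies one character
// through operator[], which is on the list of exempt member functions, and
// returns what the previously obtained pointer sees afterwards.
std::string edit_in_place() {
    std::string strTest = "This is a test";
    const char* s1 = strTest.c_str();

    strTest[8] = 'b';  // exempt from the invalidation rule, so s1 stays valid

    return std::string(s1);  // "This is b test"
}
```

    This is the well-defined way to get the effect of the first experiment in the question. Assigning a whole new value with operator= is not on the exempt list.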

    In your posted code, the assignment modifies the string and so invalidates the pointer:

    std::string strTest = "This is a test";
    const char* s1 = strTest.c_str();
    strTest = "This is b tests";  // This line makes the pointer invalid.
    

    Any later use of the pointer to access the string is undefined behavior:

    std::cout << s1 << std::endl; // Undefined behavior.
    

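    After that, it's pointless to try to make sense of what the code does. Whether the stale pointer appears to show the new text, the old text, or garbage depends on whether the implementation reused its existing buffer or allocated a new one, and none of those outcomes is guaranteed.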
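    Two simple ways to stay within defined behavior are to keep an owning copy of the characters, or to ask for the pointer again after every modification. A minimal sketch of both; the helper names snapshot and assign_then_view are hypothetical and only serve the illustration:

```cpp
#include <string>

// Illustrative helper (name is made up): returns an owning copy of the
// characters, which stays valid whatever later happens to the source string.
std::string snapshot(const std::string& s) {
    return std::string(s.c_str());
}

// Hypothetical helper: assigns a new value and only then asks for the pointer.
const char* assign_then_view(std::string& target, const char* new_value) {
    target = new_value;     // may reallocate and invalidate older pointers
    return target.c_str();  // valid until the next modification of target
}
```

    With strTest from the question, snapshot(strTest) taken before the assignment still holds "This is a test" afterwards, and the pointer returned by assign_then_view(strTest, "This is b tests") refers to the new contents.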

  • Second Question

    The standard library provides an operator<< overload that takes a std::ostream and a char const*, so that C-style strings can be printed in a sensible way. When you use:

    std::cout << "Hello, World.";
    

    you would want to see Hello, World. as output, not the value of the pointer that points to that string.

    For reasons beyond the scope of this answer, that function overload is implemented as a non-member function.

    template< class CharT, class Traits >
    basic_ostream<CharT,Traits>& operator<<( basic_ostream<CharT,Traits>& os, 
                                             const CharT* s );
    

    After the template parameters are substituted for std::ostream (CharT = char), that declaration is equivalent to:

    std::ostream& operator<<(std::ostream& os, const char* s );
    

    You can see the list of non-member overload functions at cppreference.com.
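
    Other pointer types, such as int*, have no dedicated overload. They convert to const void* and go through the member operator<<(const void*), which writes the address. That is why pdim in the question prints an address while s1 prints text. To print the address of a C string, cast it to const void* first. A minimal sketch; the helpers streamed and streamed_address are made up for this example and are not part of the standard library:

```cpp
#include <sstream>
#include <string>

// Illustrative helper (not part of the standard library): sends a value
// through operator<< and returns the text that was written.
template <class T>
std::string streamed(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Hypothetical helper: casts to const void* so the pointer overload is
// chosen and the address is written instead of the characters.
std::string streamed_address(const char* s) {
    return streamed(static_cast<const void*>(s));
}
```

    For const char* s1 = "This is a test", streamed(s1) yields the text, streamed(*s1) yields "T", and streamed_address(s1) yields the address in an implementation-defined format. The same cast works directly on std::cout: std::cout << static_cast<const void*>(s1).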