Why does calling std::string.c_str() on a function that returns a string not work?

I have the following code:

#include <iostream>
#include <string>

std::string getString() {
    std::string str("hello");
    return str;
}

int main() {
    const char* cStr = getString().c_str();
    std::cout << cStr << std::endl; // this prints garbage
}

What I thought would happen is that getString() would return a copy of str (getString() returns by value); thus, the copy of str would stay "alive" in main() until main() returns. This would make cStr point to a valid memory location: the underlying char[] or char* (or whatever) of the copy of str returned by getString(), which remains in main().

However, this is obviously not the case, as the program outputs garbage. So, the question is, when is str destroyed, and why?

Solution

getString() would return a copy of str (getString() returns by value);

That part is right.

thus, the copy of str would stay "alive" in main() until main() returns.

No, the returned copy is a temporary std::string, which is destroyed at the end of the statement in which it was created, i.e. before std::cout << cStr << std::endl; runs. cStr is then a dangling pointer; dereferencing it is undefined behavior (UB), so anything can happen.

You can copy the returned temporary to a named variable, or bind it to a const lvalue-reference or an rvalue-reference (the lifetime of the temporary is then extended until the reference goes out of scope). For example:

std::string s1 = getString();    // s1 is copy-initialized from the temporary
const char* cStr1 = s1.c_str();
std::cout << cStr1 << std::endl; // safe

const std::string& s2 = getString(); // lifetime of temporary will be extended when bound to a const lvalue-reference
const char* cStr2 = s2.c_str();
std::cout << cStr2 << std::endl; // safe

std::string&& s3 = getString();  // similar to the above
const char* cStr3 = s3.c_str();
std::cout << cStr3 << std::endl; // safe

Or use the pointer before the temporary is destroyed, e.g.:

std::cout << getString().c_str() << std::endl;  // temporary gets destroyed after the full expression

Here is an explanation from The C++ Programming Language (Special Edition), §10.4.10 Temporary Objects [class.temp]:

Unless bound to a reference or used to initialize a named object, a temporary object is destroyed at the end of the full expression in which it was created. A full expression is an expression that is not a subexpression of some other expression.

The standard string class has a member function c_str() that returns a C-style, zero-terminated array of characters (§3.5.1, §20.4.1). Also, the operator + is defined to mean string concatenation. These are very useful facilities for strings. However, in combination they can cause obscure problems. For example:
void f(string& s1, string& s2, string& s3)
{
    const char* cs = (s1 + s2).c_str();
    cout << cs;
    if (strlen(cs = (s2 + s3).c_str()) < 8 && cs[0] == 'a') {
        // cs used here
    }
}
Probably, your first reaction is "but don’t do that," and I agree. However, such code does get written, so it is worth knowing how it is interpreted.

A temporary object of class string is created to hold s1 + s2. Next, a pointer to a C-style string is extracted from that object. Then – at the end of the expression – the temporary object is deleted. Now, where was the C-style string allocated? Probably as part of the temporary object holding s1 + s2, and that storage is not guaranteed to exist after that temporary is destroyed. Consequently, cs points to deallocated storage. The output operation cout << cs might work as expected, but that would be sheer luck. A compiler can detect and warn against many variants of this problem.