I have the following code:
#include <iostream>
#include <string>

std::string getString() {
    std::string str("hello");
    return str;
}

int main() {
    const char* cStr = getString().c_str();
    std::cout << cStr << std::endl; // this prints garbage
}
What I thought would happen is that getString() would return a copy of str (getString() returns by value); thus, the copy of str would stay "alive" in main() until main() returns. This would make cStr point to a valid memory location: the underlying char[] or char* (or whatever) of the copy of str returned by getString(), which remains in main().
However, this is obviously not the case, as the program outputs garbage. So, the question is, when is str destroyed, and why?
getString() would return a copy of str (getString() returns by value);
That is correct.
thus, the copy of str would stay "alive" in main() until main() returns.
No, the returned copy is a temporary std::string, which is destroyed at the end of the full expression in which it was created (here, the initialization of cStr), i.e. before std::cout << cStr << std::endl; runs. From that point on cStr is a dangling pointer, and dereferencing it is undefined behavior (UB): anything can happen.
You can initialize a named variable from the returned temporary, or bind the temporary to a const lvalue reference or an rvalue reference (the lifetime of the temporary is then extended until the reference goes out of scope). For example:
std::string s1 = getString(); // s1 will be copy initialized from the temporary
const char* cStr1 = s1.c_str();
std::cout << cStr1 << std::endl; // safe
const std::string& s2 = getString(); // lifetime of temporary will be extended when bound to a const lvalue-reference
const char* cStr2 = s2.c_str();
std::cout << cStr2 << std::endl; // safe
std::string&& s3 = getString(); // same as above: binding to an rvalue reference also extends the lifetime
const char* cStr3 = s3.c_str();
std::cout << cStr3 << std::endl; // safe
Or use the pointer before the temporary gets destroyed, e.g.:
std::cout << getString().c_str() << std::endl; // temporary gets destroyed after the full expression
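To see these lifetime rules in action, here is a minimal sketch that records when a temporary is constructed, used, and destroyed. The Tracer type and the trace_lifetimes() function are hypothetical names made up for this illustration; they stand in for the std::string returned by getString().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tracing type (not from the original post): it appends a line
// to a shared log whenever it is constructed, used, or destroyed.
struct Tracer {
    std::vector<std::string>* log;
    std::string name;

    Tracer(std::vector<std::string>* l, const std::string& n) : log(l), name(n) {
        assert(log != nullptr);
        log->push_back("ctor " + name);
    }
    ~Tracer() { log->push_back("dtor " + name); }

    void use() const { log->push_back("use " + name); }
};

// Returns the recorded sequence of events.
std::vector<std::string> trace_lifetimes() {
    std::vector<std::string> log;

    // Case 1: a plain temporary is destroyed at the end of the full
    // expression, i.e. before the next statement runs.
    Tracer(&log, "plain").use();
    log.push_back("after plain");

    {
        // Case 2: a temporary bound to a const reference lives until the
        // reference goes out of scope.
        const Tracer& ref = Tracer(&log, "bound");
        log.push_back("after bound");
        ref.use();
    }
    log.push_back("scope closed");

    return log;
}
```

Calling trace_lifetimes() should yield "dtor plain" before "after plain", whereas "dtor bound" only appears after "use bound", right before "scope closed".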
Here is an explanation from The C++ Programming Language, Special Edition, §10.4.10 Temporary Objects [class.temp]:
Unless bound to a reference or used to initialize a named object, a temporary object is destroyed at the end of the full expression in which it was created. A full expression is an expression that is not a subexpression of some other expression.
The standard string class has a member function c_str() that returns a C-style, zero-terminated array of characters (§3.5.1, §20.4.1). Also, the operator + is defined to mean string concatenation. These are very useful facilities for strings. However, in combination they can cause obscure problems. For example:
void f(string& s1, string& s2, string& s3)
{
    const char* cs = (s1 + s2).c_str();
    cout << cs;
    if (strlen(cs = (s2 + s3).c_str()) < 8 && cs[0] == 'a') {
        // cs used here
    }
}
Probably, your first reaction is "but don’t do that," and I agree. However, such code does get written, so it is worth knowing how it is interpreted.
A temporary object of class string is created to hold s1 + s2. Next, a pointer to a C-style string is extracted from that object. Then – at the end of the expression – the temporary object is deleted. Now, where was the C-style string allocated? Probably as part of the temporary object holding s1 + s2, and that storage is not guaranteed to exist after that temporary is destroyed. Consequently, cs points to deallocated storage. The output operation cout << cs might work as expected, but that would be sheer luck. A compiler can detect and warn against many variants of this problem.
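One way to avoid the problem in the book's f() is to store each concatenation in a named std::string before calling c_str(). The sketch below is not from the book: the name f_safe, the bool return value, and the check_f_safe() helper with its inputs are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical rewrite of the book's f(): every concatenation result is kept
// in a named std::string, so each pointer returned by c_str() refers to
// storage that is still alive when the pointer is used.
// The bool return value is an addition for illustration only.
bool f_safe(const std::string& s1, const std::string& s2, const std::string& s3) {
    const std::string first = s1 + s2;   // owns the characters until f_safe returns
    const char* cs = first.c_str();      // valid while `first` is alive and unmodified
    std::cout << cs << '\n';

    const std::string second = s2 + s3;  // second named object, same reasoning
    cs = second.c_str();
    return std::strlen(cs) < 8 && cs[0] == 'a';
}

// Usage sketch with made-up inputs.
void check_f_safe() {
    assert(f_safe("x", "a", "bc"));        // s2 + s3 is "abc": short and starts with 'a'
    assert(!f_safe("x", "b", "cd"));       // "bcd" does not start with 'a'
    assert(!f_safe("x", "abcd", "efgh"));  // "abcdefgh" has length 8, which is not < 8
}
```

The pointers stay valid only as long as the named strings are neither modified nor destroyed, so returning cs from such a function would reintroduce the dangling pointer.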