I got this problem from a friend:
#include <string>
#include <vector>
#include <iostream>
void riddle(std::string input)
{
    auto strings = std::vector<std::string>{};
    strings.push_back(input);
    auto raw = strings[0].c_str();
    strings.emplace_back("dummy");
    std::cout << raw << "\n";
}

int main()
{
    riddle("Hello world of!"); // Why does this print garbage?
    //riddle("Always look at the bright side of life!"); // And why doesn't this?
    std::cin.get();
}
My first observation is that riddle() does not produce garbage when the string passed as input has more than three words. I am still trying to work out why it fails in the first case and not in the second. Anyway, I thought this would be fun to share.
This is undefined behavior (UB), meaning that anything can happen, including the code appearing to work.
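It is UB because emplace_back invalidates all pointers, references, and iterators into the vector's elements whenever it has to reallocate, that is, whenever the new size exceeds the current capacity. Here the vector holds a single element and evidently has no spare capacity, so the second insertion reallocates and raw is left dangling.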
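The reallocation rule can be made concrete with a minimal sketch. The helper names next_append_reallocates and riddle_reserved are hypothetical, introduced only for illustration. The second function is a variant of riddle() that reserves room for both elements up front, so the later emplace_back cannot reallocate and raw stays valid.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: true if appending one more element must reallocate,
// i.e. the vector is already filled up to its capacity.
bool next_append_reallocates(const std::vector<std::string>& v)
{
    return v.size() == v.capacity();
}

// Hypothetical variant of riddle(): reserve space for both elements first.
// Returns a copy of the text that raw points to instead of printing it.
std::string riddle_reserved(std::string input)
{
    std::vector<std::string> strings;
    strings.reserve(2);                    // capacity is now at least 2
    strings.push_back(std::move(input));
    const char* raw = strings[0].c_str();
    strings.emplace_back("dummy");         // size 2 <= capacity: no reallocation
    return std::string(raw);               // strings[0] was never moved or modified
}
```

Calling riddle_reserved("Hello world of!") returns the original text for both short and long inputs, because the element that raw points into never moves. Another way to avoid the problem is simply to call c_str() only after all insertions are done.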
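The first case of UB "doesn't work" because of the small string optimization (SSO). A short string keeps its characters in a buffer inside the std::string object itself, and that object lives in the memory allocated by the vector. After reallocation the old block is freed, so raw points into released memory. Note that the threshold is a character count, not a word count: "Hello world of!" is 15 characters long, which fits the inline buffer of the common standard library implementations.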
The second case of UB "works" because the text is too long for SSO and resides in an independent, heap-allocated memory block. During reallocation the string object is moved, which transfers ownership of that block to the newly constructed string object. Since the block merely changes owner and is not freed, raw still points at live memory after emplace_back, although nothing guarantees this.
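To see which of the two cases a given string falls into, one can check whether its character data lies inside the string object's own footprint. This is a diagnostic sketch only: stored_inline is a hypothetical helper, and the result for short strings depends on the standard library implementation, since the standard does not require SSO at all.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: true if s keeps its characters inside the
// std::string object itself (SSO), false if they live in a separate block.
bool stored_inline(const std::string& s)
{
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    const char* data = s.data();

    // std::less gives a total order over pointers, even unrelated ones,
    // which the built-in < operator does not guarantee.
    std::less<const char*> before;
    return !before(data, object_begin) && before(data, object_end);
}
```

On implementations with SSO, stored_inline is expected to report true for a string like "Hello world of!" and false for "Always look at the bright side of life!". A string longer than sizeof(std::string) can never be stored inline.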