I have a vector of strings as my primary data container. However, in order to interoperate with a C library, I need to be able to view these strings as character data pointers (i.e. const char*
). This sounds simple enough, so I wrote a helper class like this:
class strvecAccessor {
const std::vector<std::string>& names;
public:
strvecAccessor(const std::vector<std::string>& a) : names(a) {}
size_t size() const {
return names.size();
}
const char* item(size_t i) {
auto name = names[i];
return name.data();
}
};
This accessor class is short-lived. It serves as a wrapper around an existing vector of strings, which is guaranteed to not be modified or go out-of-scope during the lifespan of this class. An example of how this class can be used is the following:
void test(strvecAccessor& arr) {
for (size_t i = 0; i < arr.size(); ++i) {
printf("%s\n", arr.item(i));
}
}
But there is a bug in this code, which manifests itself only when I compile it in --coverage -O0
mode, and only on Unix machine (I compile with CLang 6.0.0 in C++11 compatibility mode). The bug is that the printed strings contain garbage.
I believe what happens is that the name
variable in item()
method is not a reference, but a copy of the i
-th element of the array. It goes out-of-scope at the end of the item()
function, at which point the pointer that was returned becomes dangling. Most of the time it is not noticeable since the pointer is used immediately, but in coverage mode it gets filled up with other data right after the call.
The problem disappears if I replace auto name = names[i];
with const std::string& name = names[i];
. But I don't really understand why, and whether this actually solves the problem or just buries it deeper. So my question is: why the copy is being made in the original code; and how to protect myself against these kinds of errors in the future?
const char* item(size_t i) {
auto name = names[i];
return name.data();
}
Here, name
is a local variable to the function item()
, and you're returning an address to data that is owned by that local variable. When it goes out of scope (the item()
function completes) name
will be destroyed.
Since you're guaranteeing the lifetime of the underlying vector, try this instead:
const char* item(size_t i) {
return names[i].data();
}
This will be "safe", because vector::operator[]
returns a reference to the stored data, and you don't make an extraneous copy into your name
variable like in your original.