This code can have undefined behavior:
#include <string_view>
#include <iostream>
using namespace std::string_view_literals;
void foo(std::string_view msg) {
    std::cout << msg.data() << '\n'; // undefined behavior if 'msg' is not
                                     // null-terminated
    // std::cout << msg << '\n'; is well-defined because operator<< writes
    // exactly msg.size() characters, but that's not the point
}
int main() {
    foo("hello"sv); // the view excludes the terminator; this only works because
                    // the underlying literal happens to have one
    foo("foo");     // same, and even easier to get wrong
}
The reason is that std::string_view can refer to strings that are not null-terminated, and data() does not add a null terminator. That's really limiting: to make the above code well-defined for every caller, I have to construct a std::string from it:
std::string str{ msg };
std::cout << str.data() << '\n';
This really makes std::string_view unnecessary in this case: I still have to copy the string passed to foo, so why not use move semantics and change msg to a std::string? This might be faster, but I didn't measure.
Either way, having to construct a std::string every time I want to pass the contents of a std::string_view to a function that only accepts a const char* seems unnecessary, but there has to be a reason why the Committee decided it this way.
So, why does std::string_view::data not return a null-terminated string like std::string::data?
So, why does std::string_view::data not return a null-terminated string like std::string::data?
Simply because it can't. A string_view can be a narrower view into a larger string (a substring of a string). That means the viewed string will not necessarily have a null terminator at the end of a particular view. You can't write a null terminator into the underlying string, because the view does not own it and doing so would overwrite a character the owner still needs, and you can't create a copy of the string and return a char* without a memory leak.
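To make the substring case concrete, here is a minimal sketch. The names Lengths and first_word_lengths are hypothetical, made up for this illustration. It takes a view of the first word of a std::string and compares the view's size with what a C function such as std::strlen sees through data(). Reading through data() is safe here only because the view happens to start at the beginning of a std::string, whose buffer is null-terminated; it is not something a function receiving an arbitrary string_view can rely on.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical result type for this illustration.
struct Lengths {
    std::size_t view_size;        // what the string_view reports
    std::size_t c_string_length;  // what a C API sees through data()
};

// Hypothetical helper: views the first space-delimited word of `text`.
Lengths first_word_lengths(const std::string& text) {
    const std::string_view whole{text};
    // find() returns npos when there is no space; substr then keeps everything.
    const std::string_view word = whole.substr(0, whole.find(' '));
    // word.data() points into text's buffer, so strlen keeps reading past the
    // end of the view until it reaches text's own terminator.
    return {word.size(), std::strlen(word.data())};
}
```

For the input "hello world" the view has size 5, but std::strlen sees all 11 characters, because the only terminator is the one at the end of the larger string.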
If you want a null-terminated string, you have to create a std::string copy of it.
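A minimal sketch of that copy, assuming a hypothetical C-style function legacy_length that stands in for any API requiring a null-terminated const char*. The wrapper name call_legacy is also made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C-style API that needs a null-terminated string.
std::size_t legacy_length(const char* s) {
    return std::strlen(s);
}

// Hypothetical wrapper: copies the viewed characters into an owning
// std::string, whose c_str() is guaranteed to be null-terminated.
std::size_t call_legacy(std::string_view view) {
    const std::string copy{view};  // the string_view constructor is explicit
    return legacy_length(copy.c_str());
}
```

The copy lives only for the duration of the call, so the cost is one allocation (or none, if the text fits in the small-string buffer of the implementation) and there is nothing to leak.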
Let me show a good use of std::string_view:
template <class Pred>
auto tokenize(std::string_view str, Pred is_delim) -> std::vector<std::string_view>;
Here the resulting vector contains tokens as views into the larger string.
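One possible body for that declaration, as a sketch only: the answer gives just the signature, so the loop below is an assumption about how such a function might work. It assumes is_delim is callable with a char and returns something convertible to bool, and it drops empty tokens.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Sketch of an implementation; Pred is assumed to be callable as bool(char).
template <class Pred>
auto tokenize(std::string_view str, Pred is_delim) -> std::vector<std::string_view> {
    std::vector<std::string_view> tokens;
    std::size_t i = 0;
    while (i < str.size()) {
        // Skip any run of delimiters.
        while (i < str.size() && is_delim(str[i])) ++i;
        const std::size_t start = i;
        // Advance to the end of the current token.
        while (i < str.size() && !is_delim(str[i])) ++i;
        // substr on a string_view copies no characters; the token is just a
        // pointer and a length into the original buffer.
        if (i > start) tokens.push_back(str.substr(start, i - start));
    }
    return tokens;
}
```

None of the tokens except possibly the last is followed by a null terminator, which is exactly why data() cannot promise one. The tokens are also only valid for as long as the buffer behind str stays alive.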