c++ c++17 string-view

Why is `std::string_view` not implemented differently?


Given the following code, we can see that a std::string_view is invalidated when the string grows beyond its capacity (here SSO is in effect initially, then the contents are moved to the heap):

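#include <iostream>
#include <string>
#include <string_view>

using std::cout;
using std::endl;

int main() {
    std::string s = "hi";   // short enough for SSO on typical implementations
    std::string_view v = s; // v stores a pointer into s's buffer plus a size

    cout << v << endl;

    s = "this is a long long long string now"; // too long for SSO, so s reallocates

    cout << v << endl; // v is dangling here: its pointer and size are stale
}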

output:

hi
#

So if I store a string_view to a string and then change the contents of the string, I can be in big trouble. Would it be possible, given the existing std::string implementations, to make a smarter string_view that does not have this drawback? We could store a pointer to the string object itself, determine whether the string is in SSO mode or not, and work accordingly. (I am not sure how this would work with string literals, though, so maybe that is why it was not done this way?)

I am aware that a string_view is akin to storing the return value of string::c_str(), but given that we have this wrapper around std::string, I do not think this gotcha would occur to a lot of people using the feature. Most disclaimers are about making sure the pointed-to std::string is still in scope, but this is a different issue altogether.


Solution

  • string_view knows nothing about string. It is not a "wrapper" around a string. It has no idea that std::string even exists as a type; the conversion from string to string_view happens within std::string. string_view has no association with or reliance on std::string.

    In fact, that is the entire purpose of string_view: to be able to have a non-modifiable sized string without knowing how it is allocated or managed. That it can reference any string type that stores its characters contiguously is the point of the thing. It allows you to create an interface that takes a string_view without knowing or caring whether the caller is using std::string, CString, or any other string type.

    Since the owning string's behavior is not string_view's business, there is no possible mechanism for string_view to be told when the string it references is no longer valid.
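The point about interfaces can be sketched as follows. This is a minimal illustration, not code from the standard library: `count_spaces` and `my_string` are hypothetical names, with `my_string` standing in for any third-party string type that stores its characters contiguously.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: counts spaces in any contiguous character sequence.
// It neither knows nor cares who owns the characters.
std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

// Hypothetical stand-in for a non-standard string type. All it needs to
// offer is a pointer and a length; the conversion lives in the string type,
// not in string_view.
struct my_string {
    const char* ptr;
    std::size_t len;
    operator std::string_view() const { return {ptr, len}; }
};
```

The same function accepts a `std::string`, a string literal, or a `my_string`, and in each case the view only sees a pointer and a length. Nothing in that pair can tell the view when the owner changes or goes away.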


    "We could store a pointer to the string object itself and then determine if the string is in SSO mode or not and work accordingly."

    For the sake of argument, let us ignore that string_view is not supposed to know or care whether its characters come from std::string. Let's assume that string_view only works with std::string (even though that makes the type completely worthless).

    Even then, this would not work. Or rather, it would only work if the type was functionally no different from a std::string const&.

    If string_view stores a pointer to the first character and a size, then any modification to the std::string can make either of them stale. The size can change without the string leaving small-string optimization, and it can change without causing a reallocation. The only way to correct this is to have the string_view always ask the std::string it references what its character data and size are.

    And that's no different from just using a std::string const& directly.
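A rough sketch of what such a "string-aware" view would have to look like makes the equivalence visible. `tracking_view` and `demo` are hypothetical names used only for illustration. The demo compares sizes only and never reads through the plain view after the modification, since the old pointer may no longer be valid.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical "smart" view: keeps a pointer to the std::string itself and
// asks it for its data and size on every access.
class tracking_view {
public:
    explicit tracking_view(const std::string& s) : str_(&s) {}
    const char* data() const { return str_->data(); }
    std::size_t size() const { return str_->size(); }
private:
    const std::string* str_;
};

// Returns true if the plain view's cached size went stale after the string
// was modified, while the tracking view stayed in sync with the string.
bool demo() {
    std::string s = "hi";
    std::string_view plain = s;  // caches a pointer and size == 2
    tracking_view tracked(s);    // caches nothing, only remembers &s

    s += "!";  // the size changes; on typical implementations no reallocation

    return plain.size() == 2 && tracked.size() == 3 && s.size() == 3;
}
```

Every member of `tracking_view` simply forwards to the referenced string, which is what a `std::string const&` already does. It also still dangles if the string itself is destroyed, so the lifetime problem does not go away.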