Tags: c++, c++20, stdstring, string-view, std-variant

How to store either std::string or std::string_view in a std::variant?


I am working on a lexer. I have a Token struct, which looks like this:

#include <string_view>

struct Token {
    enum class Type { ... };

    Type type;
    std::string_view lexeme;  // non-owning view into the source text
};

The Token's lexeme is just a view into a small piece of the full source code (which, by the way, is also a std::string_view).

The problem is that I need to re-map special characters (for instance, '\n'). Storing them as-is isn't a nice solution.

I've tried changing the type of lexeme to std::variant<std::string, std::string_view>, but it quickly turned into spaghetti code: every time I want to read the lexeme (for example, to check whether the type is Bool and the lexeme is "true"), it's a big pain.

Storing lexeme as an owning string won't solve the problem.

By the way, I use C++20; maybe there is a nice solution for it?


Solution

  • You could just use std::string

    Firstly, a std::string could be used in a Token just as well as a std::string_view. This might not be as costly as you think, because std::string in all major C++ standard library implementations (libstdc++, libc++, and the MSVC STL) uses SSO (small string optimization).

    This means that short tokens like "const" wouldn't be allocated on the heap; the characters would be stored directly inside the std::string object. Before bothering with std::string_view and std::variant, you might want to measure whether allocations are actually a performance issue. Otherwise, this is a case of premature optimization.
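To get a feel for SSO before profiling, a rough check like the following can help. It is only a sketch: is_stored_inline and sso_demo are hypothetical names, and the check relies on the common implementation technique of keeping short strings in a buffer inside the std::string object, which the standard does not guarantee (the old copy-on-write ABI of libstdc++, for example, behaves differently).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: reports whether the characters of `s` live inside the
// std::string object itself (SSO) rather than in a separate heap buffer.
// Addresses are compared as integers to avoid relational comparison of
// unrelated pointers.
bool is_stored_inline(const std::string& s) noexcept {
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto object_end = object_begin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= object_begin && data < object_end;
}

// Illustrative check, not a benchmark. It assumes a mainstream standard
// library with SSO enabled.
void sso_demo() {
    const std::string keyword = "const";    // short, typical token
    const std::string long_text(100, 'x');  // longer than any common SSO buffer

    assert(is_stored_inline(keyword));      // no heap allocation expected
    assert(!is_stored_inline(long_text));   // characters live on the heap
}
```

This only shows where the characters live. Whether the remaining allocations for long lexemes matter is still something to measure in the actual lexer.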

    If you insist on std::variant ...

    User @Homer512 has provided a solid solution already. Rather than using the std::variant directly, you could create a wrapper around it which provides a string-like interface for both std::string and std::string_view.

    This is easy to do, because the names and meanings of most member functions are identical for both classes. That also makes them easy to use through std::visit.

    #include <string>
    #include <string_view>
    #include <variant>

    struct MaybeOwningString
    {
        using variant_type = std::variant<std::string, std::string_view>;
        using size_type = std::string_view::size_type;
    
        variant_type v;
    
        // main member function which grants access to either alternative as a view
        std::string_view view() const noexcept {
            return std::visit([](const auto& str) -> std::string_view {
                return str;
            }, v);
        }
    
        // various helper functions which expose commonly used member functions
        bool empty() const noexcept {
            // helper functions can be implemented with std::visit, but this is verbose
            return std::visit([](const auto& str) {
                return str.empty();
            }, v);
        }
    
        size_type size() const noexcept {
            // helper functions can also be implemented by using view()
            return view().size();
        }
    
        // ...
    };
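As a sketch of how such a wrapper might be used in the lexer, the following example borrows the lexeme when it can be taken verbatim from the source and owns it only when an escape sequence has to be translated. TokenType, unescape, make_lexeme and is_true_literal are hypothetical names introduced for illustration, the enumerators are placeholders for the ones elided in the question, the operator== is an addition to the wrapper, and unescape only handles \n.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <variant>

// Condensed copy of the wrapper, plus a content comparison (an addition).
struct MaybeOwningString {
    std::variant<std::string, std::string_view> v;

    std::string_view view() const noexcept {
        return std::visit([](const auto& str) -> std::string_view { return str; }, v);
    }

    // In C++20 this single overload also enables the reversed form and !=.
    friend bool operator==(const MaybeOwningString& lhs, std::string_view rhs) noexcept {
        return lhs.view() == rhs;
    }
};

enum class TokenType { Bool, String };  // placeholder enumerators

struct Token {
    TokenType type;
    MaybeOwningString lexeme;
};

// Hypothetical, deliberately minimal: translates only the two-character
// sequence backslash + 'n' into a newline and copies everything else.
std::string unescape(std::string_view raw) {
    std::string result;
    result.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\\' && i + 1 < raw.size() && raw[i + 1] == 'n') {
            result += '\n';
            ++i;  // skip the 'n' that was just consumed
        } else {
            result += raw[i];
        }
    }
    return result;
}

// Borrow when the text can be used as-is, own when it must be rewritten.
MaybeOwningString make_lexeme(std::string_view raw) {
    if (raw.find('\\') == std::string_view::npos) {
        return {raw};        // selects the std::string_view alternative
    }
    return {unescape(raw)};  // selects the std::string alternative
}

// The check from the question no longer needs std::visit at the call site.
bool is_true_literal(const Token& token) {
    return token.type == TokenType::Bool && token.lexeme == "true";
}
```

The borrowed alternative is only valid while the source text outlives the tokens, which is the same constraint the original std::string_view member already had.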