Tags: c++, split, std, c++23, string-view

How to extract std::string_view tokens from std::ranges::lazy_split_view using >= C++23?


In a compile-time parser, I need to extract parts of a std::string_view literal and decode them.

A minimal example of what I am doing, which unfortunately does not compile:

#include <charconv>
#include <iostream>
#include <optional>
#include <ranges>
#include <string_view>

template <typename X>
static constexpr X svtox(const std::string_view &sv)
{
    X value;
    auto result = std::from_chars(sv.data(), sv.data() + sv.size(), value);
    return value;
}

static constexpr float parse(const std::string_view &sv)
{
    const std::ranges::lazy_split_view tokens(sv, " ");
    auto token_iterator = tokens.cbegin();
    const auto keyword = *token_iterator++;
    return svtox<float>(*token_iterator++);
}

int main()
{
    const auto sv = std::string_view{"Whatever 1.0"};
    float f = parse(sv);
    std::cout << f;
}

Of course, the real patterns are more complex: more values are extracted, errors are handled, and the results are processed further instead of just being printed.

When I call GCC 13.2 with c++ -std=gnu++2b -Wall -o test test.cpp, I get the following errors:

test.cpp: In function ‘constexpr float parse(const std::string_view&)’:
test.cpp:20:25: error: invalid initialization of reference of type ‘const std::string_view&’ {aka ‘const std::basic_string_view<char>&’} from expression of type ‘std::basic_const_iterator<std::ranges::lazy_split_view<std::basic_string_view<char>, std::ranges::ref_view<const char [2]> >::_OuterIter<true> >::__reference’ {aka ‘std::__common_reference_impl<const std::ranges::lazy_split_view<std::basic_string_view<char>, std::ranges::ref_view<const char [2]> >::_OuterIter<true>::value_type&&, std::ranges::lazy_split_view<std::basic_string_view<char>, std::ranges::ref_view<const char [2]> >::_OuterIter<true>::value_type, 3, void>::type’}
   20 |     return svtox<float>(*token_iterator++);
      |                         ^~~~~~~~~~~~~~~~~
test.cpp:8:50: note: in passing argument 1 of ‘constexpr X svtox(const std::string_view&) [with X = float; std::string_view = std::basic_string_view<char>]’
    8 | static constexpr X svtox(const std::string_view &sv)
      |                          ~~~~~~~~~~~~~~~~~~~~~~~~^~

The environment I am in can be recreated using the following Dockerfile (simplified example):

FROM mcr.microsoft.com/devcontainers/cpp:1-ubuntu

RUN apt-get update && export DEBIAN_FRONTEND=noninteractive && apt-get -y dist-upgrade && apt-get install -y ubuntu-release-upgrader-core && do-release-upgrade -p -f DistUpgradeViewNonInteractive -m server --allow-third-party && apt-get -y dist-upgrade && do-release-upgrade -d -f DistUpgradeViewNonInteractive -m server --allow-third-party && apt-get -y dist-upgrade

I am fairly new to modern C++ (which for me means anything after C++98), so I guess I am missing something simple.


Solution

  • The subranges produced by lazy_split_view model at most forward_range, which is why it is called "lazy". Constructing a string_view from a range requires a contiguous_range.

    Instead, you can use split_view and explicitly transform the split subranges to string_view:

    auto tokens = sv | std::views::split(' ')
                     | std::views::transform([](auto r) { return std::string_view(r); });
    auto token_iterator = tokens.cbegin();
    

    Note that, unlike lazy_split_view, split_view is not const-iterable, so you cannot iterate over it if you declare it as a const object.
