c++ c++20 string-view

Tokenize a string with std::views


I want to iterate over the tokens extracted from the std::wcin stream, using the functions and classes of the C++ standard library. I tried this:

    auto input = std::views::istream<wchar_t>(std::wcin);
    auto tokens = input | std::views::lazy_split(L' ');

    for (auto const& token : tokens)
    {
      std::wstring s1(token); // doesn't compile
      std::wstring s;
      s = token; // doesn't compile
      s = std::wstring(token); // doesn't compile
      s = std::wstring{token.begin(), token.end()}; // doesn't compile
      s = std::wstring(token.begin(), token.end()); // doesn't compile
      s = std::wstring(token.cbegin(), token.cend()); // doesn't compile
      // action goes here - requires the token to be a string
      do_something_with_the_token_string(s);
    }

Unfortunately, I didn't find a way to convert token into a wstring. How is it done? According to the error messages, the type of token is std::ranges::lazy_split_view<std::ranges::basic_istream_view<wchar_t,_Elem,_Traits>,std::ranges::single_view<wchar_t>>::_Outer_iter<false>::value_type.

Using Visual Studio 17.9.1 with /std:c++20.


Solution

  • If you are allowed to use /std:c++latest instead of /std:c++20 you can use the C++23 feature std::ranges::to:

    auto input = std::views::istream<wchar_t>(std::wcin >> std::noskipws);
    auto tokens = input | std::views::lazy_split(L' ');
    
    for (auto const token : tokens) {
        auto s1 = std::ranges::to<std::wstring>(token);
        // ...
    }
    

    Note: I assume that you've set std::noskipws on std::wcin in code that you haven't shown, but I included it above just in case. Without it, formatted input skips whitespace, so the view will contain no spaces to split on.


    In C++20 I think you will have to settle for copying:

    for (auto const& token : tokens) {
        std::wstring s1;
        std::ranges::copy(token, std::back_inserter(s1));
        // ...
    }