Tags: c++, split, string-literals, std-ranges, c++23

Why does std::views::split() compile but not split with an unnamed string literal as a pattern?


When std::views::split() is given an unnamed string literal as the pattern, it compiles but does not split the string. It works just fine with an unnamed character literal or a string_view literal.

#include <iomanip>
#include <iostream>
#include <ranges>
#include <string>
#include <string_view>

int main(void)
{
    using namespace std::literals;

    // returns the original string (not split)
    auto splittedWords1 = std::views::split("one:.:two:.:three", ":.:");
    for (const auto word : splittedWords1)
        std::cout << std::quoted(std::string_view(word));
    
    std::cout << std::endl;

    // returns the split string
    auto splittedWords2 = std::views::split("one:.:two:.:three", ":.:"sv);
    for (const auto word : splittedWords2)
        std::cout << std::quoted(std::string_view(word));
    
    std::cout << std::endl;

    // returns the split string
    auto splittedWords3 = std::views::split("one:two:three", ':');
    for (const auto word : splittedWords3)
        std::cout << std::quoted(std::string_view(word));
    
    std::cout << std::endl;

    // returns the original string (not split)
    auto splittedWords4 = std::views::split("one:two:three", ":");
    for (const auto word : splittedWords4)
        std::cout << std::quoted(std::string_view(word));
    
    std::cout << std::endl;

    return 0;
}

See live @ godbolt.org.

I understand that string literals are always lvalues. Even so, I am missing some important piece of information that connects everything together. Why can I pass the string that I want split as an unnamed string literal, whereas it fails (as in: returns a range of ranges containing only the original string) when I do the same with the pattern?


Solution

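  • A string literal always ends with a null terminator, so ":.:" is actually a const char[4]: a range of size 4 whose last element is '\0'.

    Since the original string does not contain the four-character pattern ":.:\0", it is not split. The same applies to ":", which is a two-element range ending in '\0'. By contrast, ':' is a single char, and ":.:"sv is a string_view of size 3 that excludes the terminator, so both match as expected.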

    When working with C++20 ranges, I strongly recommend using string_view instead of raw string literals: string_view works well with <ranges> and avoids the error-prone null-terminator issue.