Search code examples
c++c++20predicatestd-ranges

Why does std::views::take_while from the Ranges library require a const predicate?


TL;DR: I'm playing around with ranges and the corresponding range adaptors from the Ranges library. Both range adaptors std::views::take_while and std::views::filter take a predicate to exclude certain elements from the input sequence. Why does take_while take a const predicate while filter does not?

Background story

I have an std::vector<int> and want to iterate over it, but I want to stop iterating when hitting a 5. By using the range adaptor std::views::take_while I can implement that as follows:

std::vector<int> v { 8, 2, 5, 6 };

for (int i : v | std::views::take_while([](int i) { return i != 5; })) {
    std::cout << "Value: " << i << std::endl;
}

Output:

Value: 8
Value: 2

However, I now want to process the 5 as well, so the loop must run one iteration step further. I didn't find a suitable range adaptor, so I wrote the following stateful lambda expression:

auto cond = [b = true](int i) mutable {
    return b ? b = (i != 5), true : false;
};

This lambda expression remembers when the condition i != 5 is violated and returns false on the next call. I then passed it to the std::views::take_while as follows:

for (int i : v | std::views::take_while(cond)) {
    std::cout << "Value: " << i << std::endl;
}

However, for the above code, the compiler throws a long error message. Since I couldn't spot the problem, I closely inspected the declaration of std::views::take_while and found that the predicate Pred must be const. Looking for an alternative, I checked the declaration of std::views::filter. Interestingly, Pred is not required to be const here. So I passed the above mutable lambda to the range adaptor std::views::filter as follows:

for (int i : v | std::views::filter(cond)) {
    std::cout << "Value: " << i << std::endl;
}

This code compiles and gives the desired output:

Value: 8
Value: 2
Value: 5

Code on Wandbox

This leads me to my question: Why does std::views::take_while a const predicate, while std::views::filter does not?


Solution

  • Why this is a bad idea

    Let's produce a version that compiles and see what it actually does:

    struct MutablePredicate {
        mutable bool flag = true;
    
        auto operator()(int i) const -> bool {
            if (flag) {
                flag = (i != 5);
                return true;
            } else {
                return false;
            }
        }
    };
    
    std::vector<int> v = {8, 2, 5, 6};
    auto r = v | std::views::take_while(MutablePredicate{});
    
    fmt::print("First: {}\n", r);
    fmt::print("Second: {}\n", r);
    

    This prints {8, 2, 5} the first time, as desired. And then {} the second time. Because of course, we modified the predicate and so we get different behavior entirely. This completely breaks the semantics of this range (because your predicate fails to be equality-preserving), and all sorts of operations just totally fail as a result.

    The resulting take_view is a random-access range. But just think about what happens when you use iterators into it:

    std::vector<int> v = {8, 2, 5, 6};
    auto r = v | std::views::take_while(MutablePredicate{});
    
    auto it = r.begin();
    it += 2;                // this is the 5
    assert(it != r.end());  // does not fire, because we're not at the end
    assert(it == r.end());  // does not fire, because we're at the end??
    

    This is all sorts of weird and makes reasoning about this impossible.

    Why the difference in constraints

    The range adaptors in C++20 try to minimize the number of template instantiations by optimizing around "simple-view": V is a simple-view if both V and V const are ranges with the same iterator/sentinel types. For those cases, adaptors don't provide both begin() and begin() const... they just provide the latter (since there's no difference in these cases, and begin() const always works, so we just do it).

    Our case is a simple-view, because ref_view<vector<int>> only provides begin() const. Whether we iterate that type as const or not, we still get vector<int>::iterators out of it.

    As a result, take_while_view in order to support begin() const needs to require that Pred const is a unary predicate, not just Pred. Since Pred has to be equality-preserving anyway, it's simpler to just require that Pred const is a unary predicate rather than potentially supporting begin() /* non-const */ if only Pred but not Pred const is a unary predicate. That's just not an interesting case worth supporting.

    filter_view is not const-iterable, so it doesn't have to this consideration. It's only ever used as non-const, so there's no Pred const that it ever meaningfully has to consider as being a predicate.

    What you should do instead

    So if you don't actually need lazy evaluation, we can eagerly calculate the end iterator:

    auto e = std::ranges::find_if(v, [](int i){ return i == 5; });
    if (e != v.end()) {
        ++e;
    }
    auto r = std::ranges::subrange(v.begin(), e);
    // use r somehow
    

    But if you do need lazy evaluation, one way to do this is create your own adaptor. For a bidirectional+ range, we can define a sentinel such that we match the iterator if either (a) it's at the end of the underlying view's base or (b) it's not at the beginning of the range and the previous iterator matches the underlying view's end.

    Something like this (will only work on views that have a .base() since it only makes sense to and_one an adapted range):

    template <std::ranges::bidirectional_range V>
        requires std::ranges::view<V>
    class and_one_view {
        V base_ = V();
        using B = decltype(base_.base());
    
        class sentinel {
            friend and_one_view;
            V* parent_ = nullptr;
            std::ranges::sentinel_t<V> end_;
            std::ranges::sentinel_t<B> base_end_;
    
            sentinel(V* p)
                : parent_(p)
                , end_(std::ranges::end(*parent_))
                , base_end_(std::ranges::end(parent_->base()))
            { }
        public:
            sentinel() = default;
            auto operator==(std::ranges::iterator_t<V> it) const -> bool {
                return it == base_end_ ||
                    it != std::ranges::begin(*parent_) && std::ranges::prev(it) == end_;
            }
        };
    public:
        and_one_view() = default;
        and_one_view(V b) : base_(std::move(b)) { }
    
        auto begin() -> std::ranges::iterator_t<V> { return std::ranges::begin(base_); }
        auto end() -> sentinel { return sentinel(&base_); }
    };
    

    which for the purposes of demonstration we can make pipeable with libstdc++'s internals:

    struct AndOne : std::views::__adaptor::_RangeAdaptorClosure
    {
        template <std::ranges::viewable_range R>
            requires std::ranges::bidirectional_range<R>
        constexpr auto operator()(R&& r) const {
            return and_one_view<std::views::all_t<R>>(std::forward<R>(r));
        }
    };
    inline constexpr AndOne and_one;
    

    And now, because we adhere to all the semantic constraints of all the library components, we can just use the adapted range as a range:

    std::vector<int> v = {8, 2, 5, 6};
    auto r = v | std::views::take_while([](int i){ return i != 5; })
               | and_one;
    
    fmt::print("First: {}\n", r);   // prints {8, 2, 5}
    fmt::print("Second: {}\n", r);  // prints {8, 2, 5} as well
    

    Demo.