Search code examples
c++stlstdstd-ranges

Is there a way to address more than one pair of elements in comparison predicate?


In the code below the values vector contains data indexed by indexes.

Background

It works just fine for sorting or searching while I have to deal with one value (actually, pair of them, lhs and rhs) in the comparator at a time. The problem comes when I want to use a sequence of array elements while sorting the indexes; they must be sorted not by the values, but by sequences of values.

Question

Is there an std::range-based solution which would allow me to address the sequences of the elements in the array in std::range::sort and in std::range::equal_range?

Code

Here is current implementation (not working with std::range::equal_range, but working with std::range::sort) with a ChainComparator which shows what exactly I want to have.

auto make_index_projection = []<std::ranges::random_access_range R>(R & range) {
    return [&range](std::ranges::range_difference_t<R> i) -> decltype(auto) {
        return std::ranges::begin(range)[i];
        };
};

class ChainComparator {
    const std::vector<int>& vec;
public: 
    ChainComparator(const auto& vec) : vec(vec) {}

    bool operator() (int& lhs, int& rhs) {
        // Dirty trick
        auto lhs_range_begin = vec.begin() + (&lhs - std::data(vec));
        auto rhs_range_begin = vec.begin() + (&rhs - std::data(vec));

        return std::lexicographical_compare(lhs_range_begin, vec.end(), rhs_range_begin, vec.end());
    }
};

int main()
{
    std::vector<int> values = { 0,30,20,40,10 };
    std::vector<int> indexes = { 3,4,2,1,0 };

    auto index_projection = make_index_projection(values);

    std::ranges::sort(indexes, ChainComparator(values), index_projection);

    auto indexing_view = indexes | std::views::transform(index_projection);

    std::vector<int> values_to_search = { 20, 40 };
    // Of course, doesn’t work since the dirty trick above requires all the arguments of comparison to be in the values array which is violated here
    auto range_pred_seq = std::ranges::equal_range(indexing_view, values_to_search, ChainComparator(values));
}

Solution

  • Of course what you are looking for is possible. After all, ranges are really powerful and flexible and cool :)

    make_index_projection right now returns a single lvalue from the range, but you can also return a subrange, which is a view from the specified element to anywhere you are interested. In your example code, it seems that what you are interested are the subrange from the given element to the end, so let's do the same. For better demo, I am using a std::string (a range of chars instead of a range of ints) because it is easier to print a string.

    The range version of the make_index_projection is just as simple as follows,

    auto make_index_projection_for_subrange = []<std::ranges::random_access_range R>(R & range) {
        return [&range](std::ranges::range_difference_t<R> i) -> decltype(auto) {
            return std::ranges::subrange(std::ranges::begin(range)+i, std::ranges::end(range));
            };
    };
    

    std::ranges::subrange is just the view you are looking for that keeps track of the underlying range with the custom indices you provide.

    Now, you have already known that, to use projections and views to sort a range, you can do the following

        const std::string s1 = "abdbc";
        std::vector i1{0,1,2,3,4};
        
        //Sort by single char
        auto proj1 = make_index_projection(s1);
        std::ranges::sort(i1, {}, proj1);
    

    Then, when you want to loop over the range indirectly via the indices, you make a view from the projection:

        auto view1 = i1 | std::views::transform(proj1);
    

    If you then construct a std::string from the view, and then print it, you get the sorted result "abbcd". Not surprising.

        std::cout << std::string(view1.begin(), view1.end()) << std::endl;
    

    (Hope now you can see why I chose a std::string to demo instead. Because it provides a very natural way to output the view, but this is just marginal).

    I assume all stated above are what you already know about. Now let's try to use the new make_index_projection_for_subrange function. First, let's reset our indices to {0,1,2,3,4} so that we can better demonstrate what the view does.

        auto proj2 = make_index_projection_for_subrange(s1);
        std::ranges::sort(i1);
        auto view2 = i1 | std::views::transform(proj2);
    

    Now let's examine the view,

        for(auto s: view2){
            std::cout << std::string(s.begin(), s.end()) << std::endl;
        }
    

    The output

    abdbc
    bdbc
    dbc
    bc
    c
    

    This shows that each element of the view is itself a range (actually, a subrange view) that contains the substring from the current char to the end of the string. This is exactly what you want to achieve in ChainedComparator, but in a much cleaner way.

    Now it should be obvious how to sort the indices. Of course you still have to provide a custom comparator because comparison between two ranges is not well defined.

        std::ranges::sort(i1, [](auto&&... args){return std::ranges::lexicographical_compare(args...);}, proj2);
    

    I digress a bit here. Originally, the code was just that

        std::ranges::sort(i1, std::ranges::lexicographical_compare, proj2);
    

    This works in GCC and Clang, but not in current version of MSVC as pointed out by OP in a comment. This is because std::ranges::lexicographical_compare is a niebloid and using a niebloid as a callable is not always well defined per this excellent answer. So far all major compiler vendors use objects to implement nieboids, but MSVC currently makes the object non-copyable1, so this simpler version does not work. It can be fixed by just adding a std::cref around it to avoid copying,

        std::ranges::sort(i1, std::cref(std::ranges::lexicographical_compare), proj2);
    

    However, this is still not strictly standard-conforming, because a niebloid does not have to be an object at all and can be implemented via compiler magic. The real standard conforming way is using a lambda like above. Note that I have not included perfect forwarding in the lambda because I know that no sane implementation of lexicographical_compare would attempt to copy the ranges, but in general perfect forwarding should be a concern, and the identity lambda should instead be

    []<typename... Ts>(Ts&&... args){
        return niebloid_to_be_forwarded(std::forward<Ts>(args)...);
    }
    

    Back to the main topic. Let's confirm our indices are indeed sorted as expected:

        view2 = i1 | std::views::transform(proj2);
        for(auto s: view2){
            std::cout << std::string(s.begin(), s.end()) << std::endl;
        }
    

    Output:

    abdbc
    bc
    bdbc
    c
    dbc
    

    Isn't that cool? As I said, ranges are cool. Now I see you are concerned about using equal_range to find a subsequence. That is of course no problem, either,

        auto eqn_range = std::ranges::equal_range(view2, std::string_view{"bdbc"}, [](auto&&... args){return std::ranges::lexicographical_compare(args...);});
    

    The elements of view2 are subrange views instead of std::string_views, but std::ranges::lexicographical_compare is completely capable of comparing them. This just again shows how cool ranges are. Note that you cannot use "bdbc" itself here because it is a const char[5] and the final null byte will ruin your day.

    Finally, we can confirm that the equal_range does give the expected result:

        std::cout << "eqn_range size: " << eqn_range.size() << std::endl;
        std::cout << "eqn_range begin: " << std::string_view((*eqn_range.begin()).begin(), (*eqn_range.begin()).end()) << std::endl;
        std::cout << "eqn_range end: " << std::string_view((*(eqn_range.end())).begin(), (*(eqn_range.end())).end()) << std::endl;
    

    The outputs are:

    eqn_range size: 1
    eqn_range begin: bdbc
    eqn_range end: c
    

    Which confirms that equal_range has successfully found the sequence.

    The complete demo can be found here: https://godbolt.org/z/cqrvanzPh

    Finally, note that s1 is const std::string, so all these sorting happen without modifying the value range. All that happens are index reordering.


    1 According to this bug report, MSVC has also decided to make the niebloid a CPO in the near future, so the simpler syntax will work for all three compilers. Thanks to @康桓瑋 for pointing out this in a comment.