Search code examples
c++sortingc++20std-rangesprvalue

Understanding return type of std::ranges::sort with temporary and prvalue arguments


I have issues understanding when does a type deduce to std::ranges::dangling when using named local variables vs a prvalue as arguments to std::ranges::sort algorithm . For example I have function that returns a prvalue to a standard container, say std::vector which I use directly as an argument to std::ranges::sort, then I expect to get a compilation error regarding std::ranges::dangling when I attempt to dereference the iterator, which is what I get :

#include <vector>
#include <algorithm>

auto get_data(){
    return std::vector<int>{1, 2, 99, 5, 9, 4};
}

auto get_sorted(){
    return std::ranges::sort(get_data());
}


int main(){
    auto it = get_sorted();
    *(it-1); //Compiler Error because it is a std::ranges::dangling iterator
}

However if I change the get_sorted function above slightly to first capture the vector in a named variable and use that instead to return the result of std::ranges::sort, then I don't get a dangling iterator, in the main even though the allocated vector in named variable should have been destroyed after the function get_sorted returned :

auto get_sorted(){
    auto vec = get_data();
    return std::ranges::sort(vec);
}


int main(){
    auto it = get_sorted();
    *(it-1); //Okay result  = 99
}

Even if I change get_sorted to use a named variable based container declared locally, I get a behavior where the compiler doesn't complain about dangling iterator on being dereferenced in its caller (say main function) .

//doesn't return a std::ranges::dangling
auto get_sorted(){
    std::vector<int> vec{1, 2, 99, 5, 9, 4};
    return std::ranges::sort(vec);
}

and when I pass a prvalue vector to the std::ranges::sort algorithm, I again get a std::ranges::dangling as expected

//returns a std::ranges::dangling
auto get_sorted(){
    std::vector<int> vec{1, 2, 99, 5, 9, 4};
    return std::ranges::sort(std::vector<int>{1, 2, 99, 5, 9, 4});
}

For the cases where I don't get an error by compiler regarding std::ragnes::dangling I do observe a runtime error when I compile with fSanitize=address option which is likely because the vectors that were allocated in named variable within the get_sorted function went out of scope as soon as the latter function returned.

However I would like to understand why using named variable within a function changes the return type of get_sorted function and possible a guide on how to use temporaries and prvalues containers correctly to get std::ranges::dangling whenever possible.


Solution

  • Actually I think I'm pretty sure it's because the code is creating temporaries that go out-of-bounds. The example code on https://en.cppreference.com/w/cpp/ranges/dangling perfectly shows what you are doing, produces this behaviour.

    The code (return std::ranges::sort(get_data());) is triggering the same static assert as in the example: static_assert(std::is_same_v<std::ranges::dangling, decltype(dangling_iter)>);.

    This will yield a dangling type when the source goes out of bounds / will become inaccessible after the sort operation.

    What you did in:

    //doesn't return a std::ranges::dangling
    auto get_sorted(){
        std::vector<int> vec{1, 2, 99, 5, 9, 4};
        return std::ranges::sort(vec);
    }
    

    is trick this safety net into unsafe operations - The code using the returned iterator is trying to access freed memory (that's a big no-no). There's no good way to detect this in the standard library, and so std::ranges library is not going to protect you from it. In case of the dangling return type, it's possible to detect a temporary argument with templates. So std::ranges::sort doesn't even do any sorting probably, if implemented efficiently, it just short-circuits to returning the dangling type.

    A local variable will become unavailable after it's current scope, and this means that any references become dangling. The std::ranges library is doing it's best to try to protect you from doing this for simple cases (like your first one).

    To make the code work you'll have to keep the vector available during any iterator operations:

    auto get_data(){
        return std::vector<int>{1, 2, 99, 5, 9, 4};
    }
    
    auto get_sorted(std::vector<int>& data){
        return std::ranges::sort(data);
    }
    
    
    int main(){
        auto data = get_data(); // keep the vector in the same scope as the iterator operations
        auto it = get_sorted(data);
        *(it-1);
    }
    

    You can also do iterator operations in subscopes as long as you keep the vector alive in the parent scope(s).

    To illustrate it live: https://coliru.stacked-crooked.com/a/b8b2945cfda85a20

    As you can see, the vector gets destroyed before get_sorted even finishes.

    Remember that containers contain data, and algorithms only provide a view (with iterators) into that data, or they modify existing data, they don't make copies (unless explicitly specified like with std::copy).