Search code examples
c++sortingstdstd-ranges

Why std::sort actually moves the content of std::string?


What is the specific reason that both std::sort and std::ranges::sort lead to the underlying context of std::string copying during sorting, while std::swap just exchanges the pointers in the std::string?

See the demo with the addresses:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

int main()
{
    std::vector<std::string> arr = { "You", "HereIJustWantToHaveMemoryReallocation", "See" };

    std::cout << "\nBefore sorting:";

    for (auto& s : arr) {
        std::cout << "\nString:" << s << " at " << std::addressof(s) << " points to " << (void*)(s.c_str());
    }

    std::sort(arr.begin(), arr.end());

    std::cout << "\nAfter sorting:";
    for (auto& s : arr) {
        std::cout << "\nString:" << s << " at " << std::addressof(s) << " points to " << (void*)(s.c_str());
    }
}

The question is not how to avoid this, which can easily be done, for example, with std::string_view. The question is, why do std::sort and std::ranges::sort behave this way?

I would expect the sort algorithm to use std::swap or some kind of move semantics which would allow std::string to just exchange pointers to the underlying strings, not touching the strings themselves. Why is it not the case?


Solution

  • With the long string it doesn't use small string optimization so the pointer to the string data remains unchanged. In this case, the swap actually just moves the ownership between the two std::strings.

    For the shorter strings, small string optimization is used and the data pointer is actually a pointer inside the std::string object itself. As such, the pointer is unique for each std::string and can't be moved to another std::string.