Search code examples
c++setbinary-search-tree

Do sets with transparant comparators need to form an equivalence class?


Say I have a set with a comparator like this:

struct prefix_comparator {
    using is_transparent = void;
    struct prefix {
        std::string_view of;
    };
    bool operator()(std::string_view l, std::string_view r) const {
        return l < r;
    }
    bool operator()(std::string_view l, prefix r) const {
        // Only compare up to the size of the prefix
        auto result = l.compare(0, r.of.length(), r.of);
        return result < 0;
    }
    bool operator()(prefix l, std::string_view r) const {
        auto result = r.compare(0, l.of.length(), l.of);
        return result > 0;
    }
};

(Full example usage)

So prefix_comparator::prefix{ "XYZ" } compares equivalent to any string starting with "XYZ" with this comparator. Would it be UB to use this in a std::set?

It does not form equivalence classes with things of type prefix_comparator::prefix, since prefix objects cannot be compared with each other and stuff like equiv("XYZABC", prefix{ "XYZ" }) && equiv(prefix{ "XYZ" }, "XYZabc") but not equiv("XYZABC", "XYZabc"). But the set doesn't store prefix objects, so I don't know if this applies.

It seems to work fine in practice (with libstdc++ std::set): count() returns the correct value greater than 1, equal_range can return an iterator range larger than one element, etc. I'm unsure if find is usable since it only returns an iterator to a single element (possibly it only returns an arbitrary element?)


Solution

  • From the draft standard.

    [associative.reqmts.general]/24.2.7.1

    Each associative container is parameterized on Key and an ordering relation Compare that induces a strict weak ordering ([alg.sorting]) on elements of Key.

    there is no "top level" requirement that non-Key type elements are strict weak ordered.

    There are restrictions on methods and what the arguments are. Here is the common set used:

    (7.20) kl is a value such that a is partitioned ([alg.sorting]) with respect to c(x, kl), with x the key value of e and e in a;

    (7.21) ku is a value such that a is partitioned with respect to !c(ku, x), with x the key value of e and e in a;

    (7.22) ke is a value such that a is partitioned with respect to c(x, ke) and !c(ke, x), with c(x, ke) implying !c(ke, x) and with x the key value of e and e in a;

    (7.23) kx is a value such that

    (7.23.1) a is partitioned with respect to c(x, kx) and !c(kx, x), with c(x, kx) implying !c(kx, x) and with x the key value of e and e in a, and

    (7.23.2) kx is not convertible to either iterator or const_­iterator; and [...]

    these are used in concert with a_tran, which is the shortcut the standard uses for "an associative container when Compare has is_transparent", to describe the requirements of most (all?) of the extra methods you get when you have is_transparent turned on.

    So under this, the important part is that your non-key elements actually used actually partition your container with the actual keys stored in the container:

    a_tran.erase(kx)
    

    122 # Result: size_­type

    123 # Effects: Erases all elements in the container with key r such that !c(r, kx) && !c(kx, r) is true.

    what this means is you have to check each and every methods requiements for what it expects. Count, for example:

    a_tran.count(ke)
    

    150 # Result: size_­type

    151 # Returns: The number of elements with key r such that !c(r, ke) && !c(ke, r).

    152 # Complexity: log(a_tran.size())+a_tran.count(ke)

    So here the argument must be ke, so the specific value passed in has to be a "nice" partition of the current contents of the associative container.