Search code examples
c++vectorsetcontainersmultiset

what is the better feature do we have in multiset that is not in vector?


why we should use "multi set" (as we know "multi set" can keep repetitive key)

why we shouldn't use vector instead ?

is there any good feature in "multi set" that we don't have in vector ? (or other containers like vector)

Do you know any special usage for "multi set" ?


Solution

  • (moving/expanding from the comments)

    what is the better feature do we have in multiset that is not in vector?

    The key feature is fast (O(log N)) lookup "equivalent" (according to the comparer) elements, plus implicit sorting according to the comparer (with O(log N) cost in insertion)).

    (this plus the usual set features, such as O(1) removal (after O(log N) search) and references that are never invalidated)


    multiset is quite a "niche" container; in 10+ years of writing C++ I don't think I ever used it (or seeing it being used, FWIW). Most probably, its very existence owes much to the map/set <=> multimap/multiset symmetry.

    set is generally thought as "the container which does not allow duplicate elements", so having a set that allows duplicates apparently makes no sense at all. However, the actual point of set is to allow fast (O(log n)) lookup of objects according to some criterion, which crucially may not consider the complete content of the object.

    Let's make a step back; it's helpful here to understand that ?map is just a ?set in disguise1. In particular, you can think of a std::?map<K,V> as a std::?set<pair<const K,V>,KeyComp>, where KeyComp is a comparer that only considers the first (=key) part of the pair2 - which incidentally is exactly what std::?map::value_compare is; you'll notice also that the type of std::?map::value_type is indeed std::pair<const K, V>.

    So, ?map is essentially a convenience over ?set for the common case where you want to index values by some key that is not already "inside the value itself".

    If instead the key is already stored inside the value - which is generally what happens when we are indexing existing data by some of its properties - a ?set with a custom comparer can be used, thus avoiding the key replication of the key that would be required in a ?map.

    Let's consider an imaginary book database; you have a big std::deque<std::unique_ptr<Book>> that contains all the book objects, and you want to index them by author and by title for quick lookup.

    In this case, you can use a multiset<Book *, CustomComp> for each field you want to index; the custom comparer will implement a < operator that considers just a specific field of the pointed element.

    When a new book is added you just need to add it to the deque and to all the indexes; when a book is removed you have to remove it from the deque and from the indexes. An edit requires to first remove it from the indexes, apply the changes and then re-add it (straight modification can lead to an inconsistent state of the multiset instances, because you may be changing the ordering relation between the stored objects "behind their back").

    It's perfectly fine here that you have a set which allows duplicate elements: they are duplicate only according to the comparer, which just considers one field; it's perfectly normal to have multiple books by the same author, or different books with the same title. The point here is not "not allowing duplicates", it's "fast lookup by some key", exactly as if we were using a multimap, but without having to keep an extra copy of the key inside the indexes.


    Notes

    1. When I write ?set or ?map I'm talking about both the "regular" and "multi" variants. It's worth noting that the core of the discussion applies mostly in the same way even to the unordered counterparts, changing the comparer with the hash function.
    2. Of course, when doing a lookup you have to provide a dummy second argument, and that's why using a map is more convenient.