Search code examples
c++dictionarystandard-library

Rationale for C++'s std map insert semantics?


I'm a little bit confused by std::map::insert's semantics. I mean, I'm not complaining - the standard is the standard and the API is the way it is. Still,

insert will

the insertion operation checks for each element inserted whether another element exists already in the container with the same key value, if so, the element is not inserted and its mapped value is not changed in any way.

And - only in its single-argument version pair<iterator,bool> insert ( const value_type& x ); will it even tell you if it even inserted the (new, possibly different) value to the key(s). As far as I understand, the iterator versions will silently ignore insertions if the key already exists.

For me, this is simply counter intuitive, I would have expected the value part to be overwritten and the old value part to be discarded on insert. Obviously, the designers of the STL thought differently -- anyone knows the (historical) rationale or can give a thorough explanation of how the existing semantics make (more) sense?

By example:

There are a few basic ways to implement insert in a single-key map such as std::map:

  • insert, replace if already exists
  • insert, ignore if already exists (this is the behavior of std::map)
  • insert, throw error if already exists
  • insert, UB if already exists

I'm now trying to understand why insert_or_ignore makes more sense than insert_or_replace (or insert_or_error)!


I looked into my copy of TC++PL (unfortunately I only have the German edition), and interestingly, Stroustrup writes in chapter 17.4.1.7 (list operations for map): (sorry rough translation from German)

(...) Normally, one doesn't care whether a key(sic!) is newly inserted or already existed before the call to insert() (...)

Which, it seems to me, would only hold true for set, and not for map, because for a map, it does make quite some difference if the provided value was inserted or the old one remains in the map. (It obviously doesn't matter for the key, as that one is equivalent.)


Note: I know about operator[] and I know about Item 24 of Effective STL and the there proposed efficientAddOrUpdate function. I'm just curious for a rationale into insert's semantics because I personally find them counter intuitive.


Solution

  • I don't know about an official rationale but I would note the duality with operator[].

    It seems obvious that one would like the two flavors of insert:

    • purely additive
    • additive / destructive

    If we see a map as a sparse representation of an array, then the presence of operator[] make sense. I do not know whether pre-existing dictionaries existed and dictated this syntax (maybe, why not).

    Also, all STL containers have several overloads of insert, and this similarity of interfaces is what allow Generic Programming.

    Therefore, we have at least two contenders for the API: operator[] and insert.

    Now, in C++, if you read:

    array[4] = 5;
    

    it is naturaly that the content of the cell at index 4 has been destructively updated. As such, it is natural that map::operator[] should return a reference to allow this destructive update.

    At this point, we now need a purely additive version as well, and we have this insert method lying around. Why not ?

    Of course one could have given insert the same semantics as operator[] and then go ahead and implement a insert_or_ignore method on top. This would have been more work though.

    Therefore, while I agree that it might be surprising, I think my reasoning is not too flawed and may be a likely explanation of the circumstances that lead us here :)


    Regarding the alternatives you proposed:

    • insert, UB if already exists

    Fortunately, it is not!

    • insert, throw error if already exists

    Only Java (and derivatives) is exception-crazy. C++ was conceived in a time where exceptions were used for exceptional circumstances.

    • insert, replace if already exists
    • insert, ignore if already exists (this is the behavior of std::map)

    We agree that the choice was between one of those. Note that even though map elected the second option, it does not completely ignore the fact that the item already existed, at least in the single item version since it warns you that the item was not inserted.