Search code examples
c++sortingpriority-queue

Why do both insertion and extraction into/from a std::priority_queue take logarithmic time?


A [std::priority_queue] is a container adaptor that provides constant time lookup of the largest (by default) element, at the expense of logarithmic insertion and extraction.

Why is that? I think the sorting either happens on insertion or extraction. For example, if the sorting happens on insertion and the internal container remains sorted, wouldn't the extraction be able to happen in constant time? The top element to be removed is know and so is the next smaller one.

However, both std::priority_queue::push and std::priority_queue::pop mention in their complexity descriptions:

Complexity

Logarithmic number of comparisons

Why would both have to perform comparisons? With an internal container that stays sorted, extraction should be easy or vice versa, with sorting upon extraction, insertion should be easy.

I guess my assumption about when and how the sorting happens (or if there's any sorting happening at all) is just wrong. Could somebody please shed some light on this?


Solution

  • For example, if the sorting happens on insertion and the internal container remains sorted, wouldn't the extraction be able to happen in constant time?

    Extract could happen in constant time, but insertion would become O(n). You'd have to search for the place in the list to insert the new element and then shift all the other elements. O(1) extraction and O(n) insertion might be good for some use-cases, but not the problem that priority_queue is trying to solve.

    If sorting, on the other hand, happened on extraction, then you'd have O(n lg n) extraction and O(1) insertion. Which, again, is good for some use-cases, but that's not what priority_queue does.


    Rather than sorting elements, std::priority_queue stores its elements in a heap, which by construction has O(lg n) insertion and extraction. The structure is a tree, and insertion/extraction simply maintain the tree invariant. For some problems (like, say, search), in cases where we need to insert and extract many nodes, having O(lg n) for both operations is far superior than O(n)/O(1).

    As an example, and stealing images from Wikipedia, inserting the element 15 into the heap would initially place it at position x:

    init

    then swap it with the 8 (because the sorted invariant is broken):

    2nd step

    then finally swap it with the 11:

    final step

    In array form, the initial heap would be stored as:

    [11, 5, 8, 3, 4]
    

    and we would end up at:

    [15, 5, 11, 3, 4, 8]
    

    Extraction is just the reverse operation - bubbling down instead of bubbling up. As you see, there's no explicit "sorting" going on. We're not even touching most of the elements most of the time.


    std::priority_queue is a container adapter, but the container you provide should be a random access container with O(1) complexities for indexing, push_back, pop_back, front, back, etc. So the choice of container (unless you make a bad one) does not affect the overall complexity of priority_queue's operations.