Search code examples
pythonalgorithmdata-structuresheappriority-queue

What would you use the heapq Python module for in real life?


After reading Guido's Sorting a million 32-bit integers in 2MB of RAM using Python, I discovered the heapq module, but the concept is pretty abstract to me.

One reason is that I don't understand the concept of a heap completely, but I do understand how Guido used it.

Now, beside his kinda crazy example, what would you use the heapq module for?

Must it always be related to sorting or minimum value? Is it only something you use because it's faster than other approaches? Or can you do really elegant things that you can't do without?


Solution

  • The heapq module is commonly use to implement priority queues.

    You see priority queues in event schedulers that are constantly adding new events and need to use a heap to efficiently locate the next scheduled event. Some examples include:

    The heapq docs include priority queue implementation notes which address the common use cases.

    In addition, heaps are great for implementing partial sorts. For example, heapq.nsmallest and heapq.nlargest can be much more memory efficient and do many fewer comparisons than a full sort followed by a slice:

    >>> from heapq import nlargest
    >>> from random import random
    >>> nlargest(5, (random() for i in xrange(1000000)))
    [0.9999995650034837, 0.9999985756262746, 0.9999971934450994, 0.9999960394998497, 0.9999949126363714]