Search code examples
pythonarraysqueuepriority-queueheap

How to Initialize a Min Heap?


I'm trying to figure out how I can initialize a min heap using an array. So far my function looks like this:

    def start_heap(self,n):
        # None positions to be filled with Nodes
        arr = [none][] * n
        for i in arr:
            heapify(i) 

    def heapify(self):
        start = self.parent(len(self) - 1) # Start at parent of last leaf
        for j in range(start, -1, -1):      # going to and including the root.
            self.heapify_down(j)

 def heapify_down(self, i):
        n = len(self._pq)
        left, right = self.left(i), self.right(i)
        if left < n:
            child = left
            if right < n and self.pq[right] < self.pq[left]:
                child = right
            if self.pq[child] < self.pq[i]:
                self.swap(i, child)
                self.heapify_down(child)

Heapify Down Pseudocode:

Heapify-down(H,i): Let n = length(H) If 2i>n then
Terminate with H unchanged Else if 2i<n then
Let left=2i, and right=2i+1
Let j be the index that minimizes key[H[left]] and key[H[right]] Else if 2i=n then
Let j=2i Endif
If key[H[j]] < key[H[i]] then
swap the array entries H[i] and H[j] Heapify-down(H , j)
Endif

I'm going to build a simple node class that just holds data but I'm not sure how to actually get the start_heap function working. Keep in mind n is the maximum number of elements that can be stored.


Solution

  • Some remarks on the code you provided (not on the code you didn't provide):

    • arr is a local variable, so whatever you do with it, once the function returns, that arr will be out of scope... and lost. You need an attribute or else subclass list

    • It is not common practice to "allocate" the array and fill it with None. In Python lists are dynamic, so you don't need to reserve slots ahead of time. You just need an empty list to start with.

    • There is no need to call heapify when the heap is empty.

    • There is certainly no need to call heapify many times in a loop. All the logic for heapifying is already present in that method, so no need to call it on each index individually. But as stated in the previous point: no need to call it on an empty list -- there is nothing to move.

    So the correction is quite basic:

        def start_heap(self, max_size):
            self.max_size = max_size
            self._pq = []
    

    Then, in many of your other methods, you will have to work with self._pq and self.max_size.

    For instance, it could have a method that indicates whether the heap is full:

        def is_full(self):
            return len(self._pq) >= self.max_size
    

    If you have an add method, it would first check if there is still room:

       def add(self, node):
           if is_full(self):
               raise ValueError("Cannot add value to the heap: it is full")
           # ... rest of your code ...