I've got as assignment to implement the heap sort algorithm into either Python or Java (or any other languages). Since I'm not that really "fluent" in Python or Java I decided to do both.
But here I ran into a problem, the running time of the program is way too much hight than it "should" be. I mean by that, that the heap sort is supposed to run into a O(n * log n) and for current processor running on a clock rate of several GHz I didn't expect for that algorithm to run into over 2000secs for an array of size 320k
So for what I've done, I implemented the algorithm from the pseudo code of this sort in Python and in Java (I also tried the code in Julia from Rosetta Code to see if the running time was similar, why Julia ? Random pick)
So I checked the output for small input size problem, such as an array of size 10, 20 and 30. It appears that the array it correctly sorted in both languages/implementations.
Then I used the heapq library that implement this same algorithm to check once again if the running time was similar. It surprised me when it was actually the case... But after few tries I tried one last thing which is updating Python and then, the program using heapq ran much faster than the previous ones. Actually it was around 2k sec for the 320k array and now it around 1.5 sec or so.
I retried my algorithm and the problem was still there.
So here are the Heapsort class that I implemented:
class MaxHeap:
heap = []
def __init__(self, data=None):
if data is not None:
self.buildMaxHeap(data)
@classmethod
def toString(cls):
return str(cls.heap)
@classmethod
def add(cls, elem):
cls.heap.insert(len(cls.heap), elem)
cls.buildMaxHeap(cls.heap)
@classmethod
def remove(cls, elem):
try:
cls.heap.pop(cls.heap.index(elem))
except ValueError:
print("The value you tried to remove is not in the heap")
@classmethod
def maxHeapify(cls, heap, i):
left = 2 * i + 1
right = 2 * i + 2
largest = i
n = len(heap)
if left < n and heap[left] > heap[largest]:
largest = left
if right < n and heap[right] > heap[largest]:
largest = right
if largest != i:
heap[i], heap[largest] = heap[largest], heap[i]
cls.maxHeapify(heap, largest)
@classmethod
def buildMaxHeap(cls, heap):
for i in range(len(heap) // 2, -1, -1):
cls.maxHeapify(heap, i)
cls.heap = heap
@staticmethod
def heapSort(table):
heap = MaxHeap(table)
output = []
i = len(heap.heap) - 1
while i >= 0:
heap.heap[0], heap.heap[i] = heap.heap[i], heap.heap[0]
output = [heap.heap[i]] + output
heap.remove(heap.heap[i])
heap.maxHeapify(heap.heap, 0)
i -= 1
return output
To log the runtime for each array size (10000 - 320000) I use this loop in the main function :
i = 10000
while i <= 320000:
tab = [0] * i
j = 0
while j < i:
tab[j] = randint(0, i)
j += 1
start = time()
MaxHeap.heapSort(tab)
end = time()
pprint.pprint("Size of the array " + str(i))
pprint.pprint("Total execution time: " + str(end - start) + "s")
i *= 2
If you need the rest of the code to see where the error could be, don't hesitate I'll provide it. Just didn't want to share the whole file for no reasons.
As said earlier the running time I expected is from the worst case running time : O(n * log n) with modern architecture and a processor of 2.6GHz I would expect something around 1sec or even less (since the running time is asked in nanosecond I suppose that even 1 sec is still too long)
Here are the results :
Python (own) : Java (Own)
Time Size Time Size
593ms. 10k 243ms. 10k
2344ms. 20k 600ms. 20k
9558ms. 40k 1647ms. 40k
38999ms. 80k 6666ms. 80k
233811ms. 160k 62789ms. 160k
1724926ms. 320k 473177ms. 320k
Python (heapq) Julia (Rosetta Code)
Time Size Time Size
6ms. 10k 21ms. 10k
14ms. 20k 21ms. 20k
15ms. 40k 23ms. 40k
34ms. 80k 28ms. 80k
79ms. 160k 39ms. 160k
168ms. 320k 60ms. 320k
And according to the formula the O(n * log n) give me :
40000 10k
86021 20k
184082 40k
392247 80k
832659 160k
1761648 320k
I think that these result could be used to determine how much time it should take depending on the machine (theoretically)
As you can see the high running time result comes from my algorithm, but I can't tell where in the code and that's why I'm asking here for help. (Runs slow both in Java and Python) (Didn't try to use heap sort in java lib is there is one to see the difference with my implementation, my bad)
Thanks a lot.
Edit : I forgot to add that I run this program on a MacBook Pro (last version MacOS, i7 2,6GHz. In case the problem could also comes from anything else than the code.
Edit 2 : Here are the modifications I did on the algorithm, following the answer I received. The program run approximately 200 times faster than previously, and so now it run in barely 2sec for the array of size 320k
class MaxHeap:
def __init__(self, data=None):
self.heap = []
self.size = 0
if data is not None:
self.size = len(data)
self.buildMaxHeap(data)
def toString(self):
return str(self.heap)
def add(self, elem):
self.heap.insert(self.size, elem)
self.size += 1
self.buildMaxHeap(self.heap)
def remove(self, elem):
try:
self.heap.pop(self.heap.index(elem))
except ValueError:
print("The value you tried to remove is not in the heap")
def maxHeapify(self, heap, i):
left = 2 * i + 1
right = 2 * i + 2
largest = i
if left < self.size and heap[left] > heap[largest]:
largest = left
if right < self.size and heap[right] > heap[largest]:
largest = right
if largest != i:
heap[i], heap[largest] = heap[largest], heap[i]
self.maxHeapify(heap, largest)
def buildMaxHeap(self, heap):
for i in range(self.size // 2, -1, -1):
self.maxHeapify(heap, i)
self.heap = heap
@staticmethod
def heapSort(table):
heap = MaxHeap(table)
i = len(heap.heap) - 1
while i >= 0:
heap.heap[0], heap.heap[i] = heap.heap[i], heap.heap[0]
heap.size -= 1
heap.maxHeapify(heap.heap, 0)
i -= 1
return heap.heap
And it runs using the same main as given before
Its interesting that you posted the clock speed of your computer- you COULD calculate the actual number of steps your algorithm requires... but you would need to know an awful lot about the implementation. For example, in python every time an object is created or goes out of scope, the interpreter updates counters on the underlying object, and frees the memory if those ref counts reach 0. Instead, you should look at the relative speed.
The third party examples you posted show the speed as less then doubling when the input array length doubles. That doesn't seem right, does it? Turns out that for those examples the initial work of building the array probably dominates the time spent sorting the array!
In your code, there is already one comment that calls out what I was going to say...
heap.remove(heap.heap[i])
This operation will go through your list (starting at index 0) looking for a value that matches, and then deletes it. This is already bad (if it works as intended, you are doing 320k comparisons on that line if your code worked as you expected!). But it gets worse- deleting an object from an array is not an in-place modification- every object after the deleted object has to be moved forward in the list. Finally, there is no guarantee that you are actually removing the last object there... duplicate values could exist!
Here is a useful website that lists the complexity of various operations in python - https://wiki.python.org/moin/TimeComplexity. In order to implement an algorithm as efficiently as possible, you need as many of your data structure operations to be O(1) as possible. Here is an example... here is some original code, presumably with heap.heap being a list...
output = [heap.heap[i]] + output
heap.remove(heap.heap[i])
doing
output.append(heap.heap.pop())
Would avoid allocating a new list AND use a constant time operation to mutate the old one. (much better to just use the output backwards than use the O(n) time insert(0) method! you could use a dequeue object for output to get appendleft method if you really need the order)
If you posted your whole code there are probably lots of other little things we could help with. Hopefully this helped!