Search code examples
pythonfunctionsortingheapsort

Сounting the number of comparisons and permutations in heapsort


I'm trying to count the number of comparisons and swaps in heapsort algorithm (num_comps and num_swaps accordingly):

import numpy as np
import timeit


def heapify(arr, len_arr, i):
    num_swaps = 0
    num_comps = 0
    maxima = i
    l = 2 * i + 1
    r = 2 * i + 2
    num_comps += 1
    if l < len_arr and arr[i] < arr[l]:
        maxima = l
    num_comps += 1
    if r < len_arr and arr[maxima] < arr[r]:
        maxima = r
    if maxima != i:
        num_swaps += 1
        arr[i], arr[maxima] = arr[maxima], arr[i]
        heapify(arr, len_arr, maxima)
    return num_swaps, num_comps


def heap_sort(arr):
    num_swaps = 0
    num_comps = 0
    len_arr = len(arr)
    for i in range(len_arr, -1, -1):
        heapify(arr, len_arr, i)
    for i in range(len_arr - 1, 0, -1):
        num_swaps += 1
        arr[i], arr[0] = arr[0], arr[i]
        heapify(arr, i, 0)
    return num_swaps, num_comps


flag = input("If you wanna randomize entered data, press 'Enter'. To enter manually press any other key: ")

if flag == '':
    arr = np.random.randint(-10000, 10000, 1000)
else:
    while True:
        try:
            arr = []
            for i in range(10):
                a = int(input(f'Enter {i + 1} element: '))
                arr.append(a)
            break
        except ValueError:
            print("Input an integer!")
print(f'Non-sorted array: \n {arr}')

num_comps, num_swaps = heap_sort(arr)

len_arr = len(arr)
print(f'Sorted array: \n {arr}')
print(num_comps, 'comparisons were executed')
print(num_swaps, 'swaps were executed ')
t = timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
print(f"Program was executed by {t} seconds")

My code works but display wrong values. I know that I'm doing something wrong but I can't understand what exactly. I'm just learning python functions so please show me the code examples. I would be grateful for any help.

UPD: I modified my code according to @Luke19 answer, but I still have wrong values (1000 numbers to sort, 1000 comparisons and 2 swaps).


Solution

  • Your heapify function returns two objects, so you should be using the same syntax you used for heap_sort*: num_swaps, num_comps = heapify(...). If you don't do that, your num_comps variable will not be updated inside heap_sort.

    UPDATE:

    Note that using global variables is often not ideal. (You can find some discussion on that here and here, for example.) Usually if there is a simple way around it, you should avoid them in order to have a less bug-prone code.

    So, I'll provide you with a more detailed explanation of how to fix your code without using global variables:

    First, notice that num_swaps and num_comps need to be updated every time you call heapify, even when calling heapify inside itself! However, in your original code those two counters were being reset to zero every time heapify was called. Therefore, in order for them to keep their current values, all you need to do is include them as arguments to heapify, then use the returned values to update the original ones.

    Here's an example using your own code:

    #notice the change to the function's arguments
    def heapify(arr, len_arr, i, num_swaps, num_comps):
        maxima = i
    
        #etc...
    
        #then every time you call heapify, pass your current num_swaps and     
        # num_comps then update them by setting them to the returned values:
        if maxima != i:
            num_swaps += 1
            arr[i], arr[maxima] = arr[maxima], arr[i]
            num_swaps, num_comps = heapify(arr, len_arr, maxima, num_swaps, num_comps) #notice the change to the line
    
        return num_swaps, num_comps
    

    Because you're passing to heapify the current local values of num_swaps and num_comps, each time you do num_swaps, num_comps = heapify(...,num_swaps, num_comps) you will be updating your local values for those variables.

    You should do the same thing for the calls to heapify inside your heap_sort function:

    def heap_sort(arr):
        num_swaps = 0
        num_comps = 0
        len_arr = len(arr)
        for i in range(len_arr, -1, -1):
            num_swaps, num_comps = heapify(arr, len_arr, i, num_swaps, num_comps) #notice the change to this line
    
        #etc....
    
        return num_swaps, num_comps
    

    Hope this helps! Let me know if it's clear.