
How can I make my code find the minimum faster?


I have a question about finding the minimum in a list in a particular way:

I have a large list where every record is like this:

['22:00:19', '0026f88e557225333f01', '23', '37', '', '176.2', '0', '60', 'SOMETHING', 
 3.318717958567554e-05]

But the first and last records in the list don't contain that last number. For example:

['22:00:09', '0026f88e557225333f01', '23', '37', '', '176', '0', '60', 'SOMETHING']

I need to find the minimum of that last column (3.318717958567554e-05 in the record above) and its index on every call of my function.

Here is my code:

def find_min(data, size, num):

    for index, i in enumerate(data):

        if index == 0 or index == data.__len__() -1: continue

        if index == 1:
            minimum = float(i[9])
            idx = index
            continue

        if float(i[9]) < minimum or float(i[9]) < num:
            minimum = float(i[9])
            idx = index

    return idx, minimum

num is a user-defined threshold used when calculating the minimum (the minimum should be less than it). This code works and finds what I want, but how can I make it faster? I call this function thousands of times on a huge dataset, so this slow function makes the overall execution time very long.


Solution

  • Get rid of the per-index if checks and slice the list down to just the records you care about.

    Convert i[9] to float just once per record.

    def find_min(data, size, num):
        idx = 1
        minimum = float(data[1][9])
        for index, i in enumerate(data[2:-1]):        
            f = float(i[9])
            if f < minimum or f < num:
                minimum = f
                idx = index + 2 # +2 because of the slicing
        
        return idx, minimum
    

    Or, if the list is so large that copying a slice of it is too expensive, iterate over the indexes instead:

    def find_min(data, size, num):
        idx = 1
        minimum = float(data[1][9])
        for index in range(2, len(data)-1):     
            f = float(data[index][9])
            if f < minimum or f < num:
                minimum = f
                idx = index
        
        return idx, minimum
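
To check that a rewrite like the ones above still returns what the question's loop returns, and to see whether it is actually faster on your machine, something along these lines may help. It is only a sketch: the synthetic records from make_data, the threshold value, and the find_min_islice variant (which uses itertools.islice to avoid copying the list) are assumptions added for illustration and are not part of the original question or answer. Comparing against the original loop matters because the "or f < num" condition means the result is not always the plain minimum, so the built-in min() is not a drop-in replacement.

```python
import random
import timeit
from itertools import islice


def find_min_original(data, size, num):
    # The question's loop, with len(data) in place of data.__len__().
    for index, i in enumerate(data):
        if index == 0 or index == len(data) - 1:
            continue
        if index == 1:
            minimum = float(i[9])
            idx = index
            continue
        if float(i[9]) < minimum or float(i[9]) < num:
            minimum = float(i[9])
            idx = index
    return idx, minimum


def find_min_indexed(data, size, num):
    # The index-based rewrite from the answer.
    idx = 1
    minimum = float(data[1][9])
    for index in range(2, len(data) - 1):
        f = float(data[index][9])
        if f < minimum or f < num:
            minimum = f
            idx = index
    return idx, minimum


def find_min_islice(data, size, num):
    # Hypothetical variant: islice walks the same records as data[2:-1]
    # without building a copy of the list.
    idx = 1
    minimum = float(data[1][9])
    for index, row in enumerate(islice(data, 2, len(data) - 1), start=2):
        f = float(row[9])
        if f < minimum or f < num:
            minimum = f
            idx = index
    return idx, minimum


def make_data(n, seed=0):
    # Synthetic records shaped like the question's: only the interior
    # rows carry the tenth (numeric) field. Requires n >= 3.
    rng = random.Random(seed)
    short = ['22:00:09', 'id', '23', '37', '', '176', '0', '60', 'SOMETHING']
    rows = [short + [rng.random()] for _ in range(n - 2)]
    return [short] + rows + [short]


data = make_data(10_000)
threshold = 1e-4  # made-up threshold, stands in for the user-defined num

# Both rewrites should agree with the original loop on the same input.
expected = find_min_original(data, len(data), threshold)
assert find_min_indexed(data, len(data), threshold) == expected
assert find_min_islice(data, len(data), threshold) == expected

# Rough timing; absolute numbers depend on the machine and the data.
for fn in (find_min_original, find_min_indexed, find_min_islice):
    seconds = timeit.timeit(lambda: fn(data, len(data), threshold), number=5)
    print(f"{fn.__name__}: {seconds:.3f} s for 5 calls")
```

If the same data is passed on every call, converting the last column to float once up front, outside the function, would likely save more time than any change inside the loop, but whether that applies depends on how the function is called.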