Search code examples
pythonperformancerecursioniterationbinary-search

Wrote Binary Search algorithms for Python, both a recursive and iterative version. How would you judge their efficiency?


First things first, let me tell you how I am actually using the algorithms, lest any confusion arises, since this is NOT the typical use of a Binary Search algorithm.

I needed the functions for purely academic / testing purposes to compare speeds against other functions I've tested, including Python's in-built isinstance() and in methods / operands.

I'm going through a string that contains digits and letters and testing whether a character from it is an integer.

Thus, the application of the following algorithms upon each iteration (i.e. character) of the loop upon the string is thus:

binSearch("0123456789", "4", 0, 10)

The "4" is just as an example as you can tell.

binSearch2("0123456789", "w")

The "w" is just as an example as you can tell.

NOTE: I am aware of the existence of the bisect module. That's not the point and object of my experiment and exercise, though.


#RECURSIVE VERSION BELOW

def binSearch(list, digit, low, up):

mid = (low + up) // 2
midd = list[mid]

if digit == midd:
    return True
elif digit > midd:
    if mid == 9: return False
    return True and binSearch(list, digit, mid, up)
elif digit < midd:
    return True and binSearch(list, digit, low, mid)

#ITERATIVE VERSION BELOW

def binSearch2(list, digit):

low = 0
up = len(list)
mid = (low + up) // 2

while mid > 0:

    if digit == list[mid]: 
        return True

    elif digit > list[mid]:
        low = mid
        if low == 9: break

    else:
        up = mid

    mid = (low + up) // 2
    #print(low, mid, up)

if digit == list[mid]: return True

return False        

Comments welcome !


Solution

  • Both of your functions can be improved by making sure that mid is excluded from the range of indexes that you consider on the next pass (either a recursive call, or a subsequent loop). Since you just checked if the mid index has the value you want (and it didn't), you can exclude mid from the upper interval, by setting low to mid+1 on the next pass. It's already being excluded from the lower intervals, since the up index is beyond the end of the interval (it's half-open).

    You've also hard-coded a strange base failure case into the function, where you check for mid==9. That won't work properly for a whole lot of inputs (such as shorter strings), and may cause your code to run forever, or raise exceptions depending on where the needle character is relative to the characters in the haystack string (try searching for ' ' with your current code). The proper base case test is low >= up, which indicates that your search interval is empty. For the iterative code, you'd keep looping as long as low < up.

    Here's how I'd update your functions:

    def binSearch(list, digit, low, up):
        if low >= up:        # base case moved here, tested properly
            return False
        mid = (low + up) // 2
        midd = list[mid]
        if digit == midd:
            return True
        elif digit > midd:   # no more base case code in this block
            return binSearch(list, digit, mid+1, up)   # exclude mid from the next interval
        elif digit < midd:
            return binSearch(list, digit, low, mid)
    
    
    def binSearch2(list, digit):
        low = 0
        up = len(list)
        while low < up:  # base case test is here now
            mid = (low + up) // 2  # move this here, so we don't need to repeat it
            if digit == list[mid]: 
                return True
            elif digit > list[mid]:
                low = mid + 1      # skip mid when moving up, no longer test base case here
            else:
                up = mid
        return False