First things first, let me tell you how I am actually using the algorithms, lest any confusion arises, since this is NOT the typical use of a Binary Search algorithm.
I needed the functions for purely academic / testing purposes to compare speeds against other functions I've tested, including Python's in-built isinstance() and in methods / operands.
I'm going through a string that contains digits and letters and testing whether a character from it is an integer.
Thus, the application of the following algorithms upon each iteration (i.e. character) of the loop upon the string is thus:
binSearch("0123456789", "4", 0, 10)
The "4" is just as an example as you can tell.
binSearch2("0123456789", "w")
The "w" is just as an example as you can tell.
NOTE: I am aware of the existence of the bisect
module. That's not the point and object of my experiment and exercise, though.
#RECURSIVE VERSION BELOW
def binSearch(list, digit, low, up):
mid = (low + up) // 2
midd = list[mid]
if digit == midd:
return True
elif digit > midd:
if mid == 9: return False
return True and binSearch(list, digit, mid, up)
elif digit < midd:
return True and binSearch(list, digit, low, mid)
#ITERATIVE VERSION BELOW
def binSearch2(list, digit):
low = 0
up = len(list)
mid = (low + up) // 2
while mid > 0:
if digit == list[mid]:
return True
elif digit > list[mid]:
low = mid
if low == 9: break
else:
up = mid
mid = (low + up) // 2
#print(low, mid, up)
if digit == list[mid]: return True
return False
Comments welcome !
Both of your functions can be improved by making sure that mid
is excluded from the range of indexes that you consider on the next pass (either a recursive call, or a subsequent loop). Since you just checked if the mid
index has the value you want (and it didn't), you can exclude mid
from the upper interval, by setting low
to mid+1
on the next pass. It's already being excluded from the lower intervals, since the up
index is beyond the end of the interval (it's half-open).
You've also hard-coded a strange base failure case into the function, where you check for mid==9
. That won't work properly for a whole lot of inputs (such as shorter strings), and may cause your code to run forever, or raise exceptions depending on where the needle character is relative to the characters in the haystack string (try searching for ' '
with your current code). The proper base case test is low >= up
, which indicates that your search interval is empty. For the iterative code, you'd keep looping as long as low < up
.
Here's how I'd update your functions:
def binSearch(list, digit, low, up):
if low >= up: # base case moved here, tested properly
return False
mid = (low + up) // 2
midd = list[mid]
if digit == midd:
return True
elif digit > midd: # no more base case code in this block
return binSearch(list, digit, mid+1, up) # exclude mid from the next interval
elif digit < midd:
return binSearch(list, digit, low, mid)
def binSearch2(list, digit):
low = 0
up = len(list)
while low < up: # base case test is here now
mid = (low + up) // 2 # move this here, so we don't need to repeat it
if digit == list[mid]:
return True
elif digit > list[mid]:
low = mid + 1 # skip mid when moving up, no longer test base case here
else:
up = mid
return False