iterative binary search algorithm. I am wrote algorithm in 2 different ways. the change i made was high = len(data) and high = len(data) -1 . In both cases algorithm runs fine. But in most of the sites they show high = len(data) -1 is correct way. so is using -1 is better and why?
1st code)
def iterative_binary_search(data, target):
low = 0
high = len(data) # this line is where I need help
while low <= high:
mid = (low + high) // 2
if target == data[mid]:
return True
elif target < data[mid]:
high = mid - 1
else:
low = mid + 1
return False
2nd code)
def iterative_binary_search(data, target):
low = 0
high = len(data) -1 # this line is where I need help
while low <= high:
mid = (low + high) // 2
if target == data[mid]:
return True
elif target < data[mid]:
high = mid - 1
else:
low = mid + 1
return False
One of the codes doesn't run fine.
Calling ibs1
the first one, with high=len(data)
, and ibs2
the second one, with high = len(data)-1
, I get :
>>> haystack = [0,1,2,3,4,5,6,7,8,9]
>>> ibs2(haystack, 11)
False
>>> ibs1(haystack, 11)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 6, in ibs1
IndexError: list index out of range
len(data)
and len(data) - 1
You need to decide what low
and high
stand for, and make it very clear in your mind. When low=3
and high=6
, what does it mean? Does it mean we are searching between list indices 3 and 6 included? Or excluded? That's up to you to decide. If it's included, then you should use high = len(data) - 1
, because that is the index of the highest element of the array. If it's excluded, you should use high = len(data)
, because that is one past the index of the highest element in the array.
Both decisions are fine. But then this decision must reflect in the logic of the remaining of the code.
Hence, this code would be correct as well:
def ibs3(haystack, needle):
low = 0
high = len(haystack)
while low < high:
mid = (low + high) // 2
if needle == haystack[mid]:
return True
elif needle < haystack[mid]:
high = mid
else:
low = mid + 1
return False
Note that in python, the convention is most often to include low
and exclude high
. For instance, print(list(range(7, 10)))
prints [7, 8, 9]
: no number 10 in there!