I am going through a data structures and algorithms book and in the section about interpolation search of arrays, the book includes this piece of code:
def interpolation_search(ar:list, lo:int, hi:int, x:int):
if (lo <= hi and x >= ar[lo] and x <= ar[hi]):
pos = lo + ((hi - lo) // (ar[hi] - ar[lo]) * (x - ar[lo]))
print(pos)
if ar[pos] == x:
return pos
if ar[pos] < x:
return interpolation_search(ar, pos + 1, hi, x)
if ar[pos] > x:
return interpolation_search(ar, lo, pos - 1, x)
return -1
if __name__ == '__main__':
v = 18
ar = [10, 12, 13, 16, 18, 19, 20, 21, 22, 23, 24, 33, 35, 42, 47]
assert(interpolation_search(ar, 0, len(ar)-1, v) == 4)
Now the question I have is, if I run that, it prints 0, 1, 2, 3, 4 (from print(pos)) This seems like a more complicated linear search.
So what am I missing or is this code wrong?
I expect this interpolation search to cut up the array into two based on the logic of the formula, but the it seems the output of the formula is always i, i+1, i+2, ...
You're not calculating pos
correctly. You're using integer division too early in the expression -- you have to do the multiplication first, then divide. If you do the division first, it gets rounded down before multiplying and you lose precision.
It also caused a division by zero error when searching for the last element in the array.
def interpolation_search(ar:list, lo:int, hi:int, x:int):
if (lo <= hi and x >= ar[lo] and x <= ar[hi]):
pos = lo + ((hi - lo) * (x - ar[lo])) // (ar[hi] - ar[lo])
print(pos)
if ar[pos] == x:
return pos
if ar[pos] < x:
return interpolation_search(ar, pos + 1, hi, x)
if ar[pos] > x:
return interpolation_search(ar, lo, pos - 1, x)
return -1