I am relatively new to programming and am not looking to be particularly efficient with this algorithm regarding running time but only trying to replicate the Karatsuba algorithm and make it work.
I have tried it with many numbers and small numbers (like y = 40004009343254, x = 40004001343234) work fine and when the numbers increase in size (like y = 4000400934325423423, x = 4000400134323432423), the algorithm stops working correctly and returns similar but incorrect answers.
Any clue about what could be wrong would be very much appreciated!
NOTE: This thread is not about efficiency but about getting the correct result. That said, comments about efficiency will be taken into account and appreciated too.
CODE:
y = 4000400934325423423
x = 4000400134323432423
def firsthalf(array):
firsthalf = array[:len(array)/2]
return firsthalf
def secondhalf(array):
secondhalf = array[len(array)/2:]
return secondhalf
def arrayjoint(array):
jointarray = long(''.join(map(str,array)))
return jointarray
def karatsuba(x,y):
if len(str(x)) == 0 or len(str(y)) == 0:
return "Can't multiply by a NULL value!"
if x < 10 or y < 10:
return x * y
x_array = [long(i) for i in str(x)]
y_array = [long(i) for i in str(y)]
firsthalf_xarray = firsthalf(x_array)
secondhalf_xarray = secondhalf(x_array)
firsthalf_yarray = firsthalf(y_array)
secondhalf_yarray = secondhalf(y_array)
half_size = max(len(secondhalf_yarray), len(secondhalf_xarray))
firsthalf_x = arrayjoint(firsthalf_xarray)
secondhalf_x = arrayjoint(secondhalf_xarray)
firsthalf_y = arrayjoint(firsthalf_yarray)
secondhalf_y = arrayjoint(secondhalf_yarray)
sum_x = firsthalf_x + secondhalf_x
sum_y = firsthalf_y + secondhalf_y
first = karatsuba(firsthalf_x,firsthalf_y)
second = karatsuba(sum_x, sum_y)
third = karatsuba(secondhalf_x,secondhalf_y)
return first * 10 ** (2 * half_size) + ((second - first - third) * (10 ** half_size)) + third
result = karatsuba(x,y)
result_correct = x*y
result = str(result)
result_correct = str(result_correct)
file = open("result.txt", "w")
file.write(str(result) + "\n" + str(result_correct))
file.close
It's not an issue with floats, because Python has bignums.
The issue is that, when the inputs have disparate lengths, you split them in different places, which defeats the algebra underlying Karatsuba's algorithm. By splitting at index -half_size
(i.e., the second half has half_size
digits), we ensure that 10**half_size
is the proper base. Try this:
def digits_to_long(x_array):
return long(''.join(x_array)) if x_array else 0L
def karatsuba(x, y):
if x < 10 or y < 10:
return x * y
x_array = str(x)
y_array = str(y)
half_size = max(len(x_array), len(y_array)) // 2
firsthalf_x = digits_to_long(x_array[:-half_size])
secondhalf_x = digits_to_long(x_array[-half_size:])
firsthalf_y = digits_to_long(y_array[:-half_size])
secondhalf_y = digits_to_long(y_array[-half_size:])
sum_x = firsthalf_x + secondhalf_x
sum_y = firsthalf_y + secondhalf_y
first = karatsuba(firsthalf_x, firsthalf_y)
second = karatsuba(sum_x, sum_y)
third = karatsuba(secondhalf_x, secondhalf_y)
return first * 10**(2 * half_size) + (
(second - first - third) * (10**half_size)) + third
import random
for i in range(10000):
x = random.randrange(10**18)
y = random.randrange(10**18)
assert karatsuba(x, y) == x * y