Search code examples
pythonalgorithmkaratsuba

Karatsuba algorithm working for small numbers but not for big ones, can't see why


I am relatively new to programming and am not looking to be particularly efficient with this algorithm regarding running time but only trying to replicate the Karatsuba algorithm and make it work.

I have tried it with many numbers and small numbers (like y = 40004009343254, x = 40004001343234) work fine and when the numbers increase in size (like y = 4000400934325423423, x = 4000400134323432423), the algorithm stops working correctly and returns similar but incorrect answers.

Any clue about what could be wrong would be very much appreciated!

NOTE: This thread is not about efficiency but about getting the correct result. That said, comments about efficiency will be taken into account and appreciated too.

CODE:

y = 4000400934325423423
x = 4000400134323432423

def firsthalf(array):
    firsthalf = array[:len(array)/2]
    return firsthalf
def secondhalf(array):
    secondhalf = array[len(array)/2:]
    return secondhalf
def arrayjoint(array):
    jointarray = long(''.join(map(str,array)))
    return jointarray
def karatsuba(x,y):
    if len(str(x)) == 0 or len(str(y)) == 0:
        return "Can't multiply by a NULL value!"
    if x < 10 or y < 10:
        return x * y
    x_array = [long(i) for i in str(x)]
    y_array = [long(i) for i in str(y)]
    firsthalf_xarray = firsthalf(x_array)
    secondhalf_xarray = secondhalf(x_array)
    firsthalf_yarray = firsthalf(y_array)
    secondhalf_yarray = secondhalf(y_array)
    half_size = max(len(secondhalf_yarray), len(secondhalf_xarray))
    firsthalf_x = arrayjoint(firsthalf_xarray)
    secondhalf_x = arrayjoint(secondhalf_xarray)
    firsthalf_y = arrayjoint(firsthalf_yarray)
    secondhalf_y = arrayjoint(secondhalf_yarray)
    sum_x = firsthalf_x + secondhalf_x
    sum_y = firsthalf_y + secondhalf_y
    first = karatsuba(firsthalf_x,firsthalf_y)
    second = karatsuba(sum_x, sum_y)
    third = karatsuba(secondhalf_x,secondhalf_y)
    return first * 10 ** (2 * half_size) + ((second - first - third) * (10 ** half_size)) + third

result = karatsuba(x,y)
result_correct = x*y
result = str(result)
result_correct = str(result_correct)
file = open("result.txt", "w")
file.write(str(result)  + "\n" + str(result_correct))
file.close

Solution

  • It's not an issue with floats, because Python has bignums.

    The issue is that, when the inputs have disparate lengths, you split them in different places, which defeats the algebra underlying Karatsuba's algorithm. By splitting at index -half_size (i.e., the second half has half_size digits), we ensure that 10**half_size is the proper base. Try this:

    def digits_to_long(x_array):
        return long(''.join(x_array)) if x_array else 0L
    
    
    def karatsuba(x, y):
        if x < 10 or y < 10:
            return x * y
        x_array = str(x)
        y_array = str(y)
        half_size = max(len(x_array), len(y_array)) // 2
        firsthalf_x = digits_to_long(x_array[:-half_size])
        secondhalf_x = digits_to_long(x_array[-half_size:])
        firsthalf_y = digits_to_long(y_array[:-half_size])
        secondhalf_y = digits_to_long(y_array[-half_size:])
        sum_x = firsthalf_x + secondhalf_x
        sum_y = firsthalf_y + secondhalf_y
        first = karatsuba(firsthalf_x, firsthalf_y)
        second = karatsuba(sum_x, sum_y)
        third = karatsuba(secondhalf_x, secondhalf_y)
        return first * 10**(2 * half_size) + (
            (second - first - third) * (10**half_size)) + third
    
    
    import random
    for i in range(10000):
        x = random.randrange(10**18)
        y = random.randrange(10**18)
        assert karatsuba(x, y) == x * y