Tags: python, algorithm, math, optimization

Least common multiple of natural numbers up to a limit, say, 10'000'000


I'm working on a small Python program for myself and I need an algorithm for fast multiplication of a huge array of prime powers (over 660 000 numbers, each about 7 digits). The resulting number has over 4 million digits. Currently I'm using math.prod, which calculates it in ~10 minutes, but that's too slow, especially if I want to increase the amount of numbers.

I checked some algorithms for faster multiplication, for example the Schönhage–Strassen algorithm and Toom–Cook multiplication, but I didn't understand how they work or how to implement them. I tried some versions I found on the internet, but they didn't work well and were even slower. Does anyone know how to multiply this many numbers faster, or can someone explain what math would help here?
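
For concreteness: lcm(1, …, N) is the product of p**k over all primes p ≤ N, where k is the largest exponent with p**k ≤ N, so for N = 10'000'000 there is one prime power per prime, i.e. the "over 660 000 numbers" above. One way to build such an input list (the sieve below and the name lcm_factors are only illustrative, not the program's actual code):

    import math

    def lcm_factors(limit):
        # Prime powers p**k with the largest k such that p**k <= limit;
        # their product is lcm(1, 2, ..., limit).
        sieve = bytearray([1]) * (limit + 1)
        sieve[0] = sieve[1] = 0
        for p in range(2, math.isqrt(limit) + 1):
            if sieve[p]:
                sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
        out = []
        for p in range(2, limit + 1):
            if sieve[p]:
                pk = p
                while pk * p <= limit:
                    pk *= p
                out.append(pk)
        return out

    xs = lcm_factors(10_000_000)
    # Baseline from the question: multiply everything with math.prod.
    # result = math.prod(xs)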


Solution

  • There are two keys to making this fast. First, using the fastest mult implementation you can get. For "sufficiently large" multiplicands, Python's Karatsuba mult is O(n^1.585). The decimal module's much fancier NTT mult is more like O(n log n). But fastest of all is to install the gmpy2 extension package, which wraps GNU's GMP library, whose chief goal is peak speed. That has essentially the same asymptotics as decimal mult, but with a smaller constant factor.

    Second, the advanced mult algorithms work best when multiplying two large ints of about the same size (number of bits). You can leave that to luck, or, as below, you can force it by using a priority queue and, at each step, multiplying the "two smallest" partial products remaining.

    from gmpy2 import mpz
    from heapq import heapreplace, heappop, heapify
    
    # Assuming your input ints are in `xs`.
    mpzs = list(map(mpz, xs))
    heapify(mpzs)
    for _ in range(len(mpzs) - 1):
        # heappop removes the current smallest; mpzs[0] is then the next
        # smallest; heapreplace swaps that one out for their product.
        heapreplace(mpzs, heappop(mpzs) * mpzs[0])
    assert len(mpzs) == 1
    # the result is mpzs[0]
    

    That's the code I'd use. Note that the cost of recursion (which this doesn't use) is trivial compared to the cost of huge-int arithmetic. Heap operations are more expensive than recursion, but still relatively cheap, and can waaaaay more than repay their cost if the input is in an order such that the "by luck" methods aren't lucky enough.
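
    If you'd rather avoid the heap, the same balancing idea can be written as a simple product tree: on each pass, multiply neighbouring partial products, so both operands of every mult stay roughly the same size. A minimal sketch under the same assumptions (inputs in `xs`; the name `product_tree` is just for illustration):

    from gmpy2 import mpz

    def product_tree(xs):
        # Each pass multiplies neighbours, halving the number of partial
        # products while keeping the operands of each mult similar in size.
        vals = list(map(mpz, xs)) or [mpz(1)]
        while len(vals) > 1:
            vals = [vals[i] * vals[i + 1] if i + 1 < len(vals) else vals[i]
                    for i in range(0, len(vals), 2)]
        return vals[0]

    The heap version adapts when operand sizes are wildly uneven; the product tree just relies on the pairing staying balanced, which is fine here since the prime powers all start out roughly the same size. The same pairing also works with plain Python ints, or with decimal.Decimal under a context whose precision has been raised high enough, if gmpy2 can't be installed.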