Search code examples
pythonlargenumbercollatz

Does Python have trouble with large numbers/lists, or is there something wrong with my code?


I have the following code:

def main(n):
    sequence = []
    while n not in sequence:
        sequence.append(n)

        if n % 2 == 0:
            i = n / 2
        else:
            i = (n * 3) + 1
        n = int(i)

    print("Sequence length: " + str(len(sequence)-1))

n = int(input("Number: "))
main(n)

It takes an integer from the user, then calculates the '3x+1' (or 'Collatz') sequence for that number (see http://www.ericr.nl/wondrous/index.html#part1 for details). It then displays the length of the sequence.

To make sure my code works as expected, I'm comparing my results against the table on http://www.ericr.nl/wondrous/delrecs.html The 'Delay' column appears to be the sequence length, and 'N' is the initial integer that's input. My code gives the correct length for all 'N' values (that I've random tested) up to 1,899148,184679 (#92). However for the next value of 'N' (2,081751,768559, #93) it gives a length of '385' (instead of '1437')... The same is true for all values on 'N' (that I've tested) greater than that on the table - my code gives a much smaller answer than what they do.

Does Python have issues with large numbers that might cause this behaviour, or is there something wrong with my code?

I admit that I don't fully understand 'delay records', or indeed much of the information on that webpage. But even if I'm wrong in my assumption that a 'delay record' is just the length of the sequence generated by this method, it seems odd that my code would calculate the same values up until that specific number...

I'm using Python 3.8.10 in case that's relevant.


Solution

  • Somewhere along the way you are getting a floating point error. "But I am using integers!" I hear you say, well Python is doing a float division on this line:

    i = n / 2
    

    Seems innocuous enough, but changing to integer division solves the problem:

    i = n // 2
    

    After several hundred values, one of those divisions is giving you an error with the value being some epsilon less than the actual integer value, which then rounds down when you call int(n).

    EDIT: After double checking my last point to find the value that fails, I wasn't quite correct. What is actually happening is that while integer division is always accurate thanks to Pythons BigInt implementation, float division isn't since it still uses regular floating point numbers for speed. If your number is large enough then there simply aren't enough bytes to accurately store the number which will result in rounding errors.

    The number in question is 19981441939834942. With integer division you get 9990720969917471, while float division yields 9990720969917472.0.

    This will be a problem in any language you use (except that most other languages won't allow you to accidentally use float division on integers), so make sure to use integer divisions!