Search code examples
pythonlistmemorycompression

Compressing a list of integers in Python


I have a list of positive (random) integers with the following properties:

Number of elements: 78495

Maximum value of element: 999982

Length of list when converted to a string: 517115 (string looks like "6,79384,238956,...")

Size of list in text file on disk: 520 kb

I am trying to use this list as a precomputed list for an online judge problem because it takes a long time to actually generate this list. However, it is too large to be accepted if I paste it directly into the source code, which has a cap of 50 kb.

I looked into zlib as a way to compress the string but it only seemed to cut the size in half.

Is there a way to really shrink this down so I can unpack it / use it in the source code?


Solution

  • Given your definition ...

    it is a list of smallest-k values for which 10^k = 1 mod p for primes p > 5

    ... am I wrong to believe that your values are of the form (p - 1) / x where x is an integer significantly smaller than p?

    For instance, for p < 50, we have:

    p = 7  : 10^6  = 1 (mod 7)  => k = 6  = (p - 1) / 1  => x = 1
    p = 11 : 10^2  = 1 (mod 11) => k = 2  = (p - 1) / 5  => x = 5
    p = 13 : 10^6  = 1 (mod 13) => k = 6  = (p - 1) / 2  => x = 2
    p = 17 : 10^16 = 1 (mod 17) => k = 16 = (p - 1) / 1  => x = 1
    p = 19 : 10^18 = 1 (mod 19) => k = 18 = (p - 1) / 1  => x = 1
    p = 23 : 10^22 = 1 (mod 23) => k = 22 = (p - 1) / 1  => x = 1
    p = 29 : 10^28 = 1 (mod 29) => k = 28 = (p - 1) / 1  => x = 1
    p = 31 : 10^15 = 1 (mod 31) => k = 15 = (p - 1) / 2  => x = 2
    p = 37 : 10^3  = 1 (mod 37) => k = 3  = (p - 1) / 12 => x = 12
    p = 41 : 10^5  = 1 (mod 41) => k = 5  = (p - 1) / 8  => x = 8
    p = 43 : 10^21 = 1 (mod 43) => k = 21 = (p - 1) / 2  => x = 2
    p = 47 : 10^46 = 1 (mod 47) => k = 46 = (p - 1) / 1  => x = 1
    

    The list of x values should compress much better than the list of k values. (For instance, I'd be willing to bet that the most frequent value of x will be '1'.)

    And because it's rather easy and fast to compute primes up to 1 million (which I think is your upper bound), you may be able to quickly rebuild the list of k values based on the compressed list of x values and the real-time computed list of primes.

    You probably should have explained from the beginning what exactly you were trying to compress to get more accurate answers.