I have a set of 100 integers from 0-99 in random order. Each number in the range occurs exactly once. I am looking for the most effective lossless compression algorithm to compress this data while maintaining the ability to decompress it efficiently?
I know that compressing random numbers is theoretically not possible, but I'm wondering if it's possible for this case since each number only appears once and you basically just need to compress the order of the numbers somehow... Please don't be harsh, I'm not an expert in this field. Also, any tips for python implementations would be greatly appreciated! :)
Thanks in advance.
Using the permutation index as the compression result, usually compresses to 66 bytes and takes less than a millisecond:
from random import shuffle
from more_itertools import permutation_index, nth_permutation
n = 100
r = range(n)
for _ in range(3):
a = list(r)
shuffle(a)
i = permutation_index(a, r)
compressed = i.to_bytes(-(i.bit_length() // -8), 'big')
decompressed = list(nth_permutation(r, n, int.from_bytes(compressed, 'big')))
print(len(compressed), decompressed == a)
Should usually output (compressed bytes size and whether it was correct, three random attempts):
66 True
66 True
66 True
For n=10000, it usually compresses to 14808 bytes and still takes less than a second.