Tags: python, python-3.x, generator, long-integer, python-itertools

itertools.islice ValueError when the start value is greater than sys.maxsize


The itertools.islice() function lets me generate combinations of characters from a charset, starting from a given value. Calling my function as Generate(1, 7, "abcde", "bdca") runs perfectly fine.

However, when the start value exceeds the 'maximum' integer (greater than 2147483647), I get the error:

ValueError: Indices for islice() must be None or an integer: 0 <= x <= sys.maxsize.

How can I get itertools.islice to take large start values?

I tried setting sys.maxsize to 'a large number' (sys.maxsize = len(charset) ** maxVal) and converting startValue to an integer explicitly, but islice() just ignores that.

This is the code I have come up with so far:

import itertools

def checkValue(charset, word):
    pos = len(charset)
    value = 0
    for i,c in enumerate(reversed(word)):
        value+= (pos**i) * charset.index(c)
    return value

def Generate(minVal, maxVal, charset, startFrom):
    startValue = int(checkValue(charset, startFrom))
    print(startValue)
    allCombos = itertools.product(charset, repeat=len(startFrom))
    combos = itertools.islice(allCombos, int(startValue), None) # error is here with 'startValue'
    # generate from combo to end of length
    for num, attempt in enumerate(combos, start=startValue):
        generated = "".join(attempt)
        print(num, generated)
    # have to make new instance or skips a chunk for each length
    for length in range(minVal + 1, maxVal + 1):
        to_attempt = itertools.product(charset, repeat=length)
        for attempt in to_attempt:
            generated = "".join(attempt)
            print(generated)

Generate(1, 15, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890", "ADHkjdWCE")

Thanks for any help.


Solution

  • It's an implementation detail of islice that can't be worked around directly without reimplementing it by hand.

    If you move to a 64-bit build of Python, sys.maxsize jumps from 2**31 - 1 to 2**63 - 1, which is so large that actually exhausting a slice that long would not finish in any humanly reasonable amount of time.

    Note: Your design here is a bad idea. islice isn't magic; it still has to run through the underlying iterator (discarding results as it goes) to reach startValue. Doing that 2+ billion times is going to take a long time. I'd suggest finding a way to begin iteration directly at a later point, rather than starting from the beginning and discarding 2+ billion items.
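
One way to begin iteration directly at a later point is to treat the start index as a number in base len(charset), turn it into digits, and then count upward like an odometer. The following is only a minimal sketch of that idea: the helper name product_from is hypothetical (it is not part of itertools), and it assumes the same ordering as itertools.product(charset, repeat=length).

```python
import itertools


def product_from(charset, length, start):
    """Yield the tuples of itertools.product(charset, repeat=length),
    beginning at 0-based position `start`, without generating the
    skipped tuples first.

    Hypothetical helper for illustration; not part of itertools.
    """
    base = len(charset)
    total = base ** length
    if not 0 <= start <= total:
        raise ValueError("start must satisfy 0 <= start <= len(charset) ** length")
    if start == total:
        return  # nothing left to yield

    # Decompose `start` into `length` base-N digits, most significant first.
    # Python integers are arbitrary precision, so sys.maxsize is irrelevant here.
    digits = []
    n = start
    for _ in range(length):
        n, d = divmod(n, base)
        digits.append(d)
    digits.reverse()

    while True:
        yield tuple(charset[d] for d in digits)
        # Increment like an odometer, starting from the rightmost digit.
        pos = length - 1
        while pos >= 0:
            digits[pos] += 1
            if digits[pos] < base:
                break
            digits[pos] = 0  # carry into the next digit to the left
            pos -= 1
        else:
            return  # every digit wrapped around: the sequence is exhausted
```

With the names from the question, the islice call could then be replaced by something like product_from(charset, len(startFrom), checkValue(charset, startFrom)). Start-up cost is proportional to the word length rather than to the start index, so the size of the index no longer matters.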