The itertools.islice() function lets me generate combinations of characters from a charset, starting from a given value. Running my function with Generate(1, 7, "abcde", "bdca") works perfectly fine.
However, when the start value is very large (greater than 2147483647), I get the error:
ValueError: Indices for islice() must be None or an integer: 0 <= x <= sys.maxsize.
How can I get itertools.islice to take large start values?
I tried setting sys.maxsize to a large number, sys.maxsize = (len(charset) ** maxVal), and explicitly converting startValue to an integer, but islice() just ignores that.
This is the code I have come up with so far:
import itertools

def checkValue(charset, word):
    # Interpret word as a base-len(charset) number, last character least significant
    pos = len(charset)
    value = 0
    for i, c in enumerate(reversed(word)):
        value += (pos**i) * charset.index(c)
    return value

def Generate(minVal, maxVal, charset, startFrom):
    startValue = int(checkValue(charset, startFrom))
    print(startValue)
    allCombos = itertools.product(charset, repeat=len(startFrom))
    combos = itertools.islice(allCombos, int(startValue), None)  # error is here with 'startValue'
    # generate from combo to end of length
    for num, attempt in enumerate(combos, start=startValue):
        generated = "".join(attempt)
        print(num, generated)
    # have to make new instance or skips a chunk for each length
    for length in range(minVal + 1, maxVal + 1):
        to_attempt = itertools.product(charset, repeat=length)
        for attempt in to_attempt:
            generated = "".join(attempt)
            print(generated)

Generate(1, 15, "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890", "ADHkjdWCE")
Thanks for any help.
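This limit is an implementation detail of islice: in CPython it stores its indices as C-level Py_ssize_t integers, and sys.maxsize merely reports the largest value that type can hold, so assigning to sys.maxsize changes nothing. It can't be worked around directly without reimplementing islice by hand.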
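As a rough illustration of what reimplementing the skip by hand could look like, here is a minimal sketch. The name skip_items and its chunk parameter are made up for this example; it simply advances the iterator in pieces that are each small enough for islice on any build. It still discards every skipped item one at a time, so it removes the error without making the skip any faster.

```python
import itertools

def skip_items(iterable, n, chunk=2**31 - 1):
    """Return an iterator over `iterable` with the first `n` items discarded.

    Hypothetical helper: `n` may exceed sys.maxsize because the skip is
    performed in pieces of at most `chunk` items.
    """
    it = iter(iterable)
    while n > 0:
        step = min(n, chunk)
        # islice(it, step, step) yields nothing but advances `it` by `step` items
        next(itertools.islice(it, step, step), None)
        n -= step
    return it

# Small demonstration with a tiny chunk so the loop runs more than once
print(list(skip_items(range(10), 7, chunk=3)))
```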
If you move to a 64-bit build of Python, sys.maxsize will jump from 2**31 - 1 to 2**63 - 1, which is so large that actually running out a slice that long would not happen in any humanly reasonable amount of time.
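To see which kind of build you are currently running, a quick check along these lines should work. It assumes a typical CPython build, where the pointer size reported by struct matches the size behind sys.maxsize.

```python
import struct
import sys

# Size of a C pointer in bits: 32 on a 32-bit build, 64 on a 64-bit build
pointer_bits = struct.calcsize("P") * 8

print("pointer size:", pointer_bits, "bits")
print("sys.maxsize: ", sys.maxsize)
```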
Note: Your design here is a bad idea. islice isn't magic; it still has to run the iterator out (discarding results as it goes) to reach startValue. Doing that 2+ billion times is going to take a long time. I'd suggest finding a way to begin iteration directly at a later point, rather than starting from the beginning and discarding 2+ billion items.
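One possible way to begin directly at a later point is to treat the position as a number in base len(charset) and convert it back into a word, which is the inverse of what checkValue does. The sketch below is only an illustration under that assumption; word_at and words_from are hypothetical names, not part of itertools. It relies on itertools.product varying the last position fastest, so the first character is the most significant digit.

```python
import itertools

def word_at(charset, index, length):
    """Return the word at 0-based position `index` in
    itertools.product(charset, repeat=length), computed directly."""
    base = len(charset)
    chars = []
    for _ in range(length):
        # Peel off the least significant base-`base` digit each time
        index, digit = divmod(index, base)
        chars.append(charset[digit])
    return "".join(reversed(chars))

def words_from(charset, start, length):
    """Yield (index, word) pairs from `start` to the last word of this length,
    without generating any of the earlier words."""
    total = len(charset) ** length
    for index in range(start, total):
        yield index, word_at(charset, index, length)

# Position 210 in the "abcde" charset with length 4 is the word from the question
print(word_at("abcde", 210, 4))

# Starting far beyond 2**31 - 1 is immediate, since nothing is discarded
charset = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890"
print(next(words_from(charset, 2**40, 9)))
```

Each word here costs one divmod per character, which is slower per item than itertools.product, but the start-up cost no longer depends on how large the start value is.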