Search code examples
pythonstringpython-itertools

Speeding up a function that outputs products with repeats in Python


I've written a function which returns a list of all possible 'products' of a string with repeats (for a spell check program):

def productRepeats(string):
    comboList = []

    for item in [p for p in product(string, repeat = len(list(string)))]:
        if item not in comboList:
            comboList.append("".join(item))

    return list(set(comboList))

Therefore, when print(productRepeats("PIP")) is inputted, he output is (I don't care what the order is):

['PII', 'IIP', 'PPI', 'IPI', 'IPP', 'PPP', 'PIP', 'III']

However, if I try anything greater than 5 digits (PIIIIP), it takes about 30 seconds to output, even though there are only 64 ways

Is there any way I could speed this up, as getting the list for the string 'GERAWUHGP', for instance, takes well over half an hour?


Solution

  • There are a couple of tricks you can use:

    1. Use a list comprehension or map to perform your iteration.
    2. As @jwodder explains, use set(string) to avoid checking for duplicates at a later stage.

    Here's a demo. I'm seeing a ~900x improvement for "hello":

    from itertools import product
    
    def productRepeats(string):
        comboList = []
    
        for item in [p for p in product(string, repeat = len(list(string)))]:
            if item not in comboList:
                comboList.append("".join(item))
    
        return list(set(comboList))
    
    def productRepeats2(string):
        return list(map(''.join, product(set(string), repeat=len(string))))
    
    assert set(productRepeats2('hello')) == set(productRepeats('hello'))
    
    %timeit productRepeats('hello')   # 127 ms
    %timeit productRepeats2('hello')  # 143 µs