Tags: python, performance, python-itertools, words

Python program very slow


This program generates letter combinations and checks whether they are words, but it is extremely slow, producing only a few words per second. Please tell me why it is so slow and what I need to change to make it faster.

import itertools 

for p1 in itertools.combinations('abcdefghijklmnopqrstuvwxyz', 4):
    with open('/Users/kyle/Documents/english words.txt') as word_file:
        english_words = set(word.strip().lower() for word in word_file)

    def is_english_word(word):
        return word.lower() in english_words

    print ''.join(p1),"is", is_english_word(''.join(p1))

Solution

  • It is slow because you re-read the file and create a new function object on every loop iteration. Neither of these depends on the loop variable; move both out of the loop so they run only once.

    Furthermore, the simple function can be inlined; calling a function is relatively expensive. Don't call ''.join() twice, either. And since you only use lowercase letters to generate the words, calling .lower() on the generated word is redundant:

    import itertools

    # Load the word list once, before the loop.
    with open('/Users/kyle/Documents/english words.txt') as word_file:
        english_words = set(word.strip().lower() for word in word_file)

    for p1 in itertools.combinations('abcdefghijklmnopqrstuvwxyz', 4):
        word = ''.join(p1)  # join once and reuse the result
        print '{} is {}'.format(word, word in english_words)
    

    Since you are generating words of length 4, you could save some memory by loading only the 4-letter words from your word file:

    with open('/Users/kyle/Documents/english words.txt') as word_file:
        english_words = set(word.strip().lower() for word in word_file if len(word.strip()) == 4)
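
Putting the points above together, here is a minimal sketch of the whole program. It assumes Python 3 (the snippets above use the Python 2 print statement), and the helper names load_words and check_combinations are made up for illustration. An in-memory io.StringIO with a handful of sample words stands in for the word file so the sketch runs anywhere; with a real file you would pass the open file object instead.

```python
import io
import itertools
import string


def load_words(lines, length):
    """Return the set of lowercased words that have exactly `length` letters."""
    words = set()
    for line in lines:
        word = line.strip().lower()  # strip once and reuse the result
        if len(word) == length:
            words.add(word)
    return words


def check_combinations(english_words, length=4):
    """Yield (candidate, is_word) for every letter combination."""
    for letters in itertools.combinations(string.ascii_lowercase, length):
        candidate = ''.join(letters)  # join once per candidate
        yield candidate, candidate in english_words


# Hypothetical sample data standing in for the word file.
sample_file = io.StringIO("Best\ndirt\nhello\nfist\ncat\n")
english_words = load_words(sample_file, 4)  # built once, outside the loop

results = list(check_combinations(english_words))
hits = [word for word, is_word in results if is_word]
print(len(results), "combinations checked;", len(hits), "matched:", hits)
```

There are 14950 four-letter combinations of 26 letters, so the original loop opened and parsed the word file 14950 times; here it is read once and each candidate costs a single set lookup.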