How can I write a Python generator that lazily generates all strings composed of lowercase English letters of up to a certain length1?
I have written my own solution (posted below as an answer), but I would like to see if there are any more elegant/efficient/fun solutions out there.
1 An infinite iterator would be pretty useless because it would just generate strings composed of only the character a
. This is because the lexicographical ordering of strings is not a well-order; it can be thought of as composed of an infinite sequence of infinitely nested sequences: (a
, (aa
, ...), (ab
, ...), ...), (b
, (ba
, ...), (bb
, ...), ...), ... The generator would never reach ab
since it has an infinite amount of predecessors.
Here is my solution:
import string
def lexstrings(max_length: int, alphabet=string.ascii_lowercase):
yield ""
if max_length == 0: return
for first in alphabet:
for suffix in lexstrings(max_length - 1, alphabet=alphabet):
yield first + suffix
Example:
>>> g = lexstrings(max_length=3, alphabet="ab")
>>> list(g)
['',
'a',
'aa',
'aaa',
'aab',
'ab',
'aba',
'abb',
'b',
'ba',
'baa',
'bab',
'bb',
'bba',
'bbb']
This might not be the best solution because it involves recursion and using the +
operator m
times to generate a string of length m
, which isn't efficient because Python generates copies of the intermediate results (since strings are immutable).
This implementation also "supports" the infinite version:
>>> g = lexstrings(-1)
>>> next(g)
''
>>> next(g)
'a'
>>> next(g)
'aa'
>>> next(g)
'aaa'
...