I am analyzing the phonetic composition of words, and as part of this I have been using cartesian products to match spelling permutations with a given word. Each sound in a word can be represented by several spellings, and the program determines the correct spelling for each sound in a word. There are an unknown number of lists, of unknown length.
I am currently user itertools' product() inside of a list comprehension, i.e. brute-forced, every permutation checked before returning a value. Here is the relevant part in Python 3:
from itertools import product
def cartesian_match(string, iterables):
"""Gets the phonetic spelling breakdown of a word via cartesian product.
Args:
string (str): String for which a matched spelling is wanted.
iterables (list): A list of lists of unknown number and length.
Each sublist contains only str elements.
Each sublist contains all possible spellings of a
phoneme.
Returns:
list: the first matched list of spelling units.
Example (simplified):
Args:
string = "python"
iterables = [
'p', 'pp'],['i', 'ie', 'y', 'igh'],['th'],['or', 'ou', 'e', 'o'],[
'nd', 'nn', 'n', 'ne']
Returns:
['p', 'y', 'th', 'o', 'n']
"""
return [x for x in product(*iterables) if "".join(x) == string][0]
For complex words, the cartesian product is large, tens of millions of permutations. Some words take upwards of 15 minutes to compute. I have thousands of words to analyze so speed is currently an issue.
To speed things up, I need a function which returns the value as soon as it is discovered, rather than forming a cartesian product and having to run through each and every permutation. It would also allow me to optimize the sequence of elements inside each sub-list in order to get the matched value sooner.
My challenge is that I cannot figure out how to do this iteratively with an unknown number of lists of unknown length, and I've failed at any attempt to break out of a recursive function early.
Can anybody point me in the right direction?
for x in in product(*iterables):
if "".join(x) == string:
return x
BTW: your function is not recursive - the title of this question is misleading.