python list format combinations cartesian-product

How to make combination of a string using list of words for each position


I have a string like this:

original_text = "womens wear apparel bike"

Each word of original_text can have alternative words, as in this list:

text_to_generate = [['females', 'ladies'], 'wear', ['clothing', 'clothes'], ['biking', 'cycling', 'running']]

I want to generate all possible phrases by combining the words in that list, one word per position. I want something like this:

text1 = 'females wear clothing biking'
text2 = 'females wear clothes cycling'
text3 = 'ladies wear clothing biking'
text4 = 'ladies wear clothes cycling'
text5 = 'ladies wear clothes running'

The word lists might not all be the same length.

This is what I have tried so far:

original_text = "womens wear apparel bike"
alternates_dict = {
    "mens": ["males"],
    "vitamins": ["supplements"],
    "womens": ["females", "ladies"],
    "shoes": ["footwear"],
    "apparel": ["clothing", "clothes"],
    "kids": ["childrens", "childs"],
    "motorcycle": ["motorbike"],
    "watercraft": ["boat"],
    "medicine": ["medication"],
    "supplements": ["vitamins"],
    "t-shirt": ["shirt"],
    "pram": ["stroller"],
    "bike": ["biking", "cycling"],
}

splitted = original_text.split()
for i in range(0,len(splitted)):
    if splitted[i] in alternates_dict.keys():
        splitted[i] = alternates_dict[splitted[i]]
        for word in splitted[i]:
            update  = original_text.replace(original_text.split()[i], word)
            print(update)
print(splitted)

Solution

  • Take a look at the itertools module. It has many useful combinatorics-related functions, including product, which gives the Cartesian product of the iterables passed to it.

    In your case, you could do something like this:

    from itertools import product
    
    template = "{} wear {} {}"
    options = [
      ['women', 'females', 'ladies'],
      ['apparel', 'clothing', 'clothes'],
      ['biking', 'cycling', 'running']
    ]
    
    print(*(template.format(*ops) for ops in product(*options)), sep='\n')
    

    women wear apparel biking
    women wear apparel cycling
    women wear apparel running
    women wear clothing biking
    ...
    females wear clothing running
    females wear clothes biking
    ...
    ladies wear clothes running


    I see in your edit that you are using a dictionary as the source of options. The solution in this case is almost the same: take the product of the dictionary values, then use zip to pair each combination back with the keys.

    from itertools import product

    template = "{women} wear {apparel} {bike}"
    options = {
      'women': ['females', 'ladies'],
      'apparel': ['clothing', 'clothes'],
      'bike': ['biking', 'cycling', 'running']
    }
    
    combos = (zip(options.keys(), vals) for vals in product(*options.values()))
    print(*(template.format(**dict(ops)) for ops in combos), sep='\n')
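
If you would rather not hand-write a template, the same idea can be applied directly to original_text and alternates_dict from the question. The following is only a sketch: build_options and generate_phrases are hypothetical helper names, not part of itertools, and it assumes that a word with no entry in the dictionary (such as "wear") should be kept unchanged and that the original word itself should not appear among the options.

```python
from itertools import product


def build_options(text, alternates):
    """Return one list of candidate words per word of `text` (hypothetical helper)."""
    # Assumption: a word without an entry in `alternates` stays fixed,
    # so it becomes a one-item list.
    return [alternates.get(word, [word]) for word in text.split()]


def generate_phrases(options):
    """Yield every phrase that picks one word per position (hypothetical helper)."""
    # A bare string such as 'wear' is wrapped in a list so that product()
    # does not iterate over its characters.
    normalized = [[opt] if isinstance(opt, str) else list(opt) for opt in options]
    for combo in product(*normalized):
        yield " ".join(combo)


# Shortened copy of the dictionary from the question.
alternates_dict = {
    "womens": ["females", "ladies"],
    "apparel": ["clothing", "clothes"],
    "bike": ["biking", "cycling"],
}
original_text = "womens wear apparel bike"

phrases = list(generate_phrases(build_options(original_text, alternates_dict)))
print(*phrases, sep="\n")
```

Because generate_phrases wraps bare strings, it should also accept the mixed text_to_generate list from the question as-is. To include the original word as one of the options, build_options could return [word] + alternates.get(word, []) instead.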