
Python: associating a description to each combination of itertools.product


I have a function which runs a numerical simulation. I want to define a number of possible inputs for each parameter, and run the function on all the possible combinations. The way I'm doing it now is with itertools:

import itertools

param1 = ['London', 'New York', 'Paris']
param2 = [dataframe1, dataframe2]
param2_description = ['optimistic assumptions', 'conservative assumptions']


myprod = itertools.product(param1, param2)
for i in myprod:
    myresult = myfunction(i[0], i[1])

My question is: how can I associate a description with each possible value of a parameter and pass it to the function? In other words, when i[1] is dataframe1, how can I pass 'optimistic assumptions' to my function?

I thought about looking up each item's position in its list (e.g. with list.index), but I'm not sure that works for all objects, such as a pandas DataFrame.

Thanks! PS: I don't have to use itertools at all costs; I'm open to alternatives based on other approaches.


Solution

  • Why not simply pass the description together with the relevant dataframe, as a tuple (or a dict, or any other datatype that suits your purpose), to itertools.product?

    Example:

    In [14]: myprod1 = itertools.product(param1, zip(param2, param2_description))  # note: the dataframes are replaced by the letters 'a' and 'b' for readability
    
    In [15]: list(myprod1)
    Out[15]: 
    [('London', ('a', 'optimistic assumptions')),
     ('London', ('b', 'conservative assumptions')),
     ('New York', ('a', 'optimistic assumptions')),
     ('New York', ('b', 'conservative assumptions')),
     ('Paris', ('a', 'optimistic assumptions')),
     ('Paris', ('b', 'conservative assumptions'))]
    

    For each item in myprod1 you can now run the simulation on the dataframe (item[1][0]) while the description of that dataframe is available in item[1][1], and this holds for every combination in the Cartesian product built by itertools. You could also use a dictionary for this purpose, which is actually a good choice when you have metadata describing your dataframes.
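
    Putting it together, the loop can unpack each product item directly instead of indexing into nested tuples. A minimal runnable sketch, with plain dicts standing in for the pandas dataframes and a hypothetical myfunction that just reports what it received:

    ```python
    import itertools

    # Hypothetical stand-ins for the question's dataframes.
    dataframe1 = {"growth": 0.05}
    dataframe2 = {"growth": 0.01}

    param1 = ['London', 'New York', 'Paris']
    param2 = [dataframe1, dataframe2]
    param2_description = ['optimistic assumptions', 'conservative assumptions']

    def myfunction(city, df, description):
        # Placeholder simulation: just report the inputs it was given.
        return f"{city}: growth={df['growth']} ({description})"

    results = []
    # zip pairs each dataframe with its description; product combines
    # those pairs with the cities, and the for-loop unpacks both levels.
    for city, (df, description) in itertools.product(param1, zip(param2, param2_description)):
        results.append(myfunction(city, df, description))
    ```

    The same pattern works with a dict per dataframe (e.g. {"df": dataframe1, "desc": "optimistic assumptions"}) if you later want to attach more metadata than a single description string.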