python, python-3.x, dictionary, python-itertools, cartesian-product

Cartesian product of two dictionaries in Python


OK, so I've got two dictionaries.

dictionary_1 = {'status': ['online', 'Away', 'Offline'],
                'Absent': ['yes', 'no', 'half day']}
dictionary_2 = {'healthy': ['yes', 'no'],
                'insane': ['yes', 'no']}

Now I need to combine them so that I get a new dictionary with:

{'status': ['online', 'online', 'Away', 'Away', 'Offline', 'Offline'],
 'Absent': ['yes', 'yes', 'no', 'no', 'half day', 'half day'],
 'healthy': ['yes', 'no', 'yes', 'no', 'yes', 'no'],
 'insane': ['yes', 'no', 'yes', 'no', 'yes', 'no']
}

This is a very late update, but I found a way to do it without itertools, if anyone is interested.

def cartesian_product(dict1, dict2):
    cartesian_dict = {}
    # Number of entries per list (all lists in a dictionary are assumed equally long)
    dict1_length = len(list(dict1.values())[0])
    dict2_length = len(list(dict2.values())[0])
    for key in dict1:
        for value in dict1[key]:
            if key not in cartesian_dict:
                cartesian_dict[key] = []
            # Repeat each dict1 value once per dict2 entry: a, a, b, b, ...
            cartesian_dict[key].extend([value] * dict2_length)
    for key in dict2:
        # Repeat the whole dict2 list once per dict1 entry: x, y, x, y, ...
        cartesian_dict[key] = dict2[key] * dict1_length
    return cartesian_dict
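
As a quick sanity check, here is a compact sketch of the same repeat-and-tile idea, run against the two dictionaries from the question. The name `repeat_and_tile` is made up for this example, and it assumes that all lists within one dictionary have the same length.

```python
def repeat_and_tile(dict1, dict2):
    """Hypothetical compact variant: repeat dict1 values, tile dict2 lists."""
    # Entry counts, assuming all lists within one dictionary are equally long.
    n1 = len(next(iter(dict1.values())))
    n2 = len(next(iter(dict2.values())))
    combined = {}
    for key, values in dict1.items():
        # Each value appears n2 times in a row: a, a, b, b, ...
        combined[key] = [value for value in values for _ in range(n2)]
    for key, values in dict2.items():
        # The whole list is repeated n1 times: x, y, x, y, ...
        combined[key] = list(values) * n1
    return combined


dictionary_1 = {'status': ['online', 'Away', 'Offline'],
                'Absent': ['yes', 'no', 'half day']}
dictionary_2 = {'healthy': ['yes', 'no'],
                'insane': ['yes', 'no']}

result = repeat_and_tile(dictionary_1, dictionary_2)
for key, column in result.items():
    print(key, column)
```

Every list in the result has 3 * 2 = 6 entries, one per pairing of a `dictionary_1` entry with a `dictionary_2` entry.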

Solution

  • Best guess, based on @abarnert's interpretation (and assuming that the healthy and insane values in the output as originally posted were wrong, as they had only four members):

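    import itertools

    d1 = {'status': ['online', 'Away', 'Offline'], 'absent': ['yes', 'no', 'half day']}
    d2 = {'healthy': ['yes', 'no'], 'insane': ['yes', 'no']}
    # zip(*values) groups the i-th entry of every list into one tuple
    d1_columns = zip(*d1.values())
    d2_columns = zip(*d2.values())
    # pair every d1 tuple with every d2 tuple and concatenate them
    col_groups = [c1+c2 for c1, c2 in itertools.product(d1_columns, d2_columns)]
    # transpose back so there is one tuple of values per key
    rows = zip(*col_groups)
    combined_keys = list(d1) + list(d2)
    d_combined = dict(zip(combined_keys, rows))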
    

    which produces

    >>> import pprint
    >>> pprint.pprint(d_combined)
    {'absent': ('yes', 'yes', 'no', 'no', 'half day', 'half day'),
     'healthy': ('yes', 'no', 'yes', 'no', 'yes', 'no'),
     'insane': ('yes', 'no', 'yes', 'no', 'yes', 'no'),
     'status': ('online', 'online', 'Away', 'Away', 'Offline', 'Offline')}
    

    or, in your order,

    >>> order = ["status", "absent", "healthy", "insane"]
    >>> for k in order:
    ...     print(k, d_combined[k])
    ...
    status ('online', 'online', 'Away', 'Away', 'Offline', 'Offline')
    absent ('yes', 'yes', 'no', 'no', 'half day', 'half day')
    healthy ('yes', 'no', 'yes', 'no', 'yes', 'no')
    insane ('yes', 'no', 'yes', 'no', 'yes', 'no')
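
If lists are needed rather than tuples, as in the question, the same approach can be wrapped in a function. This is only a sketch: `product_of_tables` is a made-up name, and it relies on Python 3.7+ dictionaries preserving insertion order, so the keys come out in the asked-for order without a separate `order` list.

```python
import itertools


def product_of_tables(d1, d2):
    """Hypothetical helper: pair every entry of d1 with every entry of d2."""
    # zip(*values) groups the i-th entry of every list into one tuple.
    rows1 = list(zip(*d1.values()))
    rows2 = list(zip(*d2.values()))
    # Concatenate each pairing of a d1 tuple with a d2 tuple.
    combined_rows = [r1 + r2 for r1, r2 in itertools.product(rows1, rows2)]
    keys = list(d1) + list(d2)
    # Transpose back to one sequence per key, converting tuples to lists.
    columns = [list(col) for col in zip(*combined_rows)]
    return dict(zip(keys, columns))


d1 = {'status': ['online', 'Away', 'Offline'], 'absent': ['yes', 'no', 'half day']}
d2 = {'healthy': ['yes', 'no'], 'insane': ['yes', 'no']}

d_lists = product_of_tables(d1, d2)
for key, column in d_lists.items():
    print(key, column)
```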