I have a dataframe with comma separated values that i split out out using pd.concat.
original df:
org country type
Orange USA, GBR, AUS OWF, PMR, KIQ
Red AUS, RUS, NZL DOG, MOP, LOF
splitting out columns gives me a df, which we'll call df_wide,
org country_1 country_2 country_3 type_1 type_2 type_3
Orange USA GBR AUS OWF PMR KIQ
Watermelon AUS RUS NZL ODG MOP LOF
From the above dataframe, i need to get every possible combination of a single country and single type in a long format:
org country type
Orange USA OWF
Orange USA PMR
Orange USA KIQ
Orange GBR OWF
Orange GBR PMR
Orange GBR KIQ
..and so forth
and this is where I'm stuck. I mistakenly thought that i could just transform the dataframe using pd.wide_to_long, but I think the answer my revolves around using itertools. I've searched the forums that relate to this issue, but I still haven't quite figured it out. Looking for any suggestions! also the commas separated values in the original df columns could be dozens of values and therefore I don’t know how many columns wide my wide df will be.
Here's one solution using itertools.product
. It doesn't require the intermediary dataframe you created.
from itertools import chain, product
df = pd.DataFrame({'org': ['Orange', 'Red'],
'country': ['USA, GBR, AUS', 'AUS, RUS, NZL'],
'type': ['OWF, PMR, KIQ', 'DOG, MOP, LOF']})
split1 = df['country'].str.split(', ')
split2 = df['type'].str.split(', ')
lens = split1.map(len) * split2.map(len)
c_list, t_list = zip(*chain.from_iterable(map(product, split1, split2)))
res = pd.DataFrame({'org': np.repeat(df['org'], lens),
'country': c_list,
'type': t_list})
Explanation
The magic happens with this line:
c_list, t_list = zip(*chain.from_iterable(map(product, split1, split2)))
Working from the inside out:
split1
/ split2
.Result
print(res)
org country type
0 Orange USA OWF
0 Orange USA PMR
0 Orange USA KIQ
0 Orange GBR OWF
0 Orange GBR PMR
0 Orange GBR KIQ
0 Orange AUS OWF
0 Orange AUS PMR
0 Orange AUS KIQ
1 Red AUS DOG
1 Red AUS MOP
1 Red AUS LOF
1 Red RUS DOG
1 Red RUS MOP
1 Red RUS LOF
1 Red NZL DOG
1 Red NZL MOP
1 Red NZL LOF