import pandas as pd

df = pd.DataFrame({'Categotry':['Food','Animal'],
                   'Detail':[['Name','Color','Sweet?','Bread','Brown','No','Rice','White','No','Sushi','N/A','No'],
                             ['Name','Predator?','Habitat','Tigers','Yes','Forests','Lions','Yes','Savanna','Deers','No','Hardwoods']]})
I have the dataframe above and I want to split the Detail column as below:
How can I do that in Python?
Thanks for the help.
import numpy as np

def process_details(details):
    # Reshape the flat list into rows of 3; the first row holds the column names.
    cols, *data = np.reshape(details, (-1, 3))
    return pd.DataFrame(data, columns=cols)
I use np.reshape because I'm used to it. However, this accomplishes the same thing without NumPy:
def process_details(details):
    cols, *data = zip(*[iter(details)] * 3)
    return pd.DataFrame(data, columns=cols)
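As a quick check that the two versions chunk the list the same way, here is a minimal sketch. It assumes the list always holds a multiple of three items (three header names followed by rows of three values). The `zip(*[iter(details)] * 3)` idiom hands the same iterator to `zip` three times, so each output tuple consumes three consecutive items.

```python
import numpy as np

details = ['Name', 'Color', 'Sweet?', 'Bread', 'Brown', 'No',
           'Rice', 'White', 'No', 'Sushi', 'N/A', 'No']

# NumPy version: reshape the flat list into rows of 3.
reshaped = np.reshape(details, (-1, 3)).tolist()

# Pure-Python version: one shared iterator, consumed three items at a time.
chunked = [list(row) for row in zip(*[iter(details)] * 3)]

print(chunked[0])   # header row
print(chunked[1:])  # data rows
```

One difference worth knowing: if the list length is not a multiple of three, `np.reshape` raises a `ValueError`, while `zip` silently drops the leftover items.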
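Because the column names don't match up between the two categories, I'd concatenate the results side by side, with the category as the top column level: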
# zip(*map(df.get, df)) pairs each category with its Detail list.
pd.concat({
    cat: process_details(details)
    for cat, details in zip(*map(df.get, df))
}, sort=False, axis=1)
  Animal                       Food
    Name Predator?    Habitat   Name  Color Sweet?
0 Tigers       Yes    Forests  Bread  Brown     No
1  Lions       Yes    Savanna   Rice  White     No
2  Deers        No  Hardwoods  Sushi    N/A     No
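With the category as the top column level, each category's table can be pulled back out by name. The sketch below is self-contained and only illustrative: the variable names `wide` and `food` are mine, and `list(cols)` is used just to make the header an explicit list.

```python
import pandas as pd

df = pd.DataFrame({
    'Categotry': ['Food', 'Animal'],
    'Detail': [
        ['Name', 'Color', 'Sweet?', 'Bread', 'Brown', 'No',
         'Rice', 'White', 'No', 'Sushi', 'N/A', 'No'],
        ['Name', 'Predator?', 'Habitat', 'Tigers', 'Yes', 'Forests',
         'Lions', 'Yes', 'Savanna', 'Deers', 'No', 'Hardwoods'],
    ],
})

def process_details(details):
    # First chunk of 3 is the header, the remaining chunks are rows.
    cols, *data = zip(*[iter(details)] * 3)
    return pd.DataFrame(data, columns=list(cols))

# Side-by-side concat: dict keys become the top level of the columns.
wide = pd.concat(
    {cat: process_details(details) for cat, details in zip(*map(df.get, df))},
    sort=False, axis=1,
)

# Selecting a top-level key returns that category's own table.
food = wide['Food']
print(food)
```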
But if you insist on stacking them, leave out axis=1:
pd.concat({
    cat: process_details(details)
    for cat, details in zip(*map(df.get, df))
}, sort=False)
            Name Predator?    Habitat  Color Sweet?
Animal 0  Tigers       Yes    Forests    NaN    NaN
       1   Lions       Yes    Savanna    NaN    NaN
       2   Deers        No  Hardwoods    NaN    NaN
Food   0   Bread       NaN        NaN  Brown     No
       1    Rice       NaN        NaN  White     No
       2   Sushi       NaN        NaN    N/A     No
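If you do stack them, the NaN padding can be removed again when looking at a single category. This is a sketch under the same assumptions as above (lists in chunks of three); `stacked` and `animal` are illustrative names, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'Categotry': ['Food', 'Animal'],
    'Detail': [
        ['Name', 'Color', 'Sweet?', 'Bread', 'Brown', 'No',
         'Rice', 'White', 'No', 'Sushi', 'N/A', 'No'],
        ['Name', 'Predator?', 'Habitat', 'Tigers', 'Yes', 'Forests',
         'Lions', 'Yes', 'Savanna', 'Deers', 'No', 'Hardwoods'],
    ],
})

def process_details(details):
    cols, *data = zip(*[iter(details)] * 3)
    return pd.DataFrame(data, columns=list(cols))

# Row-wise concat: dict keys become the outer level of the row index.
stacked = pd.concat(
    {cat: process_details(details) for cat, details in zip(*map(df.get, df))},
    sort=False,
)

# Select one category's rows, then drop the columns that are entirely NaN.
animal = stacked.loc['Animal'].dropna(axis=1, how='all')
print(animal)
```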