Python Pandas Dataframe: Make Rows from Multiple Column Into Multiple Column

I have dataframe like this:

Product	Colour	Grade	Price	Size
Apple	Yellow	A	1000	3
Apple	Green	B	500	4
Banana	Yellow	B	2000	5
Banana	Green	A	3000	6

I want to convert the dataframe to:

Product	Price_Yellow_A	Size_Yellow_A	Price_Yellow_B	Size_Yellow_B	Price_Green_A	Size_Green_A	Price_Green_B	Size_Green_B
Apple	1000	3	0	0	0	0	500	4
Banana	0	0	2000	5	3000	6	0	0

How to do this?

Solution

You can use pd.pivot() to achieve this, with some other intermediary steps.

First create a column of the combined Colour and Grade like this:

df['Combined'] = df['Colour'] + '_' + df['Grade']

0    Yellow_A
1     Green_B
2    Yellow_B
3     Green_A
Name: Combined, dtype: object

Then pivot() the dataframe, keeping Product as your index, using the new Combined column as columns and the Price and Size columns as values. I added .fillna(0) to put 0s in the blanks. Like this:

new_df = df.pivot(index='Product', columns=['Combined'], values=['Price', 'Size']).fillna(0)

           Price                              Size                          
Combined Green_A Green_B Yellow_A Yellow_B Green_A Green_B Yellow_A Yellow_B
Product                                                                     
Apple        0.0   500.0   1000.0      0.0     0.0     4.0      3.0      0.0
Banana    3000.0     0.0      0.0   2000.0     6.0     0.0      0.0      5.0

This returns the desired table, but with mutli-level columns, so we can use a bit of list comprehension to rename the columns to the desired column names, like this:

new_df.columns = [col[0]+'_'+col[1] for col in new_df.columns]

         Price_Green_A  Price_Green_B  Price_Yellow_A  Price_Yellow_B  Size_Green_A  Size_Green_B  Size_Yellow_A  Size_Yellow_B
Product                                                                                                                        
Apple              0.0          500.0          1000.0             0.0           0.0           4.0            3.0            0.0
Banana          3000.0            0.0             0.0          2000.0           6.0           0.0            0.0            5.0

Full code:

df['Combined'] = df['Colour'] + '_' + df['Grade']
new_df = df.pivot(index='Product', columns=['Combined'], values=['Price', 'Size']).fillna(0)
new_df.columns = [col[0]+'_'+col[1] for col in new_df.columns]