I'm working on this dataset:
# dummy data
import pandas as pd
data = pd.DataFrame({None : ['Company', 'AAA', 'BBB', 'CCC','Company', 'AAA' ],
None : ['Copper', 'Copper', 'Iron', 'Iron', 'Gold', 'Gold'],
"NaN" : ['Net','Gross', 'Net','Gross', 'Net','Gross' ],
"11/01/2021" : [1,2,3,4,5,6],
"12/01/2021" : [10,11,12,13,14,15],
"13/01/2021" : [19,20,21,22,23,25]})
data
And I'm trying to get this output:
I tried with stack/melt but I wasn't able to get the data in this format.
Thanks
You can use .melt()
to transform the table and then use .rename()
and .sort_values()
and .reset_index()
to format to the desired column names and row sequence:
(data.rename({None: 'Material', 'NaN': 'Revenue'}, axis=1)
.melt(id_vars=['Material', 'Revenue'], var_name='Month', value_name='Value')
.sort_values(['Material', 'Revenue'], ascending=[True, False])
).reset_index(drop=True)
Result:
Material Revenue Month Value
0 Copper Net 11/01/2021 1
1 Copper Net 12/01/2021 10
2 Copper Net 13/01/2021 19
3 Copper Gross 11/01/2021 2
4 Copper Gross 12/01/2021 11
5 Copper Gross 13/01/2021 20
6 Gold Net 11/01/2021 5
7 Gold Net 12/01/2021 14
8 Gold Net 13/01/2021 23
9 Gold Gross 11/01/2021 6
10 Gold Gross 12/01/2021 15
11 Gold Gross 13/01/2021 25
12 Iron Net 11/01/2021 3
13 Iron Net 12/01/2021 12
14 Iron Net 13/01/2021 21
15 Iron Gross 11/01/2021 4
16 Iron Gross 12/01/2021 13
17 Iron Gross 13/01/2021 22