I'm fairly new to Pandas and often find myself stuck in situations that I have no solution to them rather than asking for help. I have read so many tutorials and checked many Stackoverflow topics but none of them asnwered mine since it is very difficult to search for. Would some please help me?
I have an excel file that I read as df. The df is something like this (I simplified it):
pd.DataFrame({'date': ['type', 'model', '2020', '2021', '2022', '2023'],
'factor1': ['type_1', 'model_1', '1', '2', '3', '4'],
'factor2': ['type_2', 'model_1', '5', '6', '7', '8'],
'factor3': ['type_3', 'model_2', '9', '10', '11', '12']
})
The result is:
date factor1 factor2 factor3
0 type type_1 type_2 type_3
1 model model_1 model_1 model_2
2 2020 1 5 9
3 2021 2 6 10
4 2022 3 7 11
5 2023 4 8 12
and the original data in the excel file that I based my dataframe definition off of is:
What I want to produce is something like this:
2020 2021 2022 2023
0 model_1 type_1 1 2 3 4
1 type_2 5 6 7 8
2 model_2 type_3 9 10 11 12
I tried any combo of melt, pivot and stack but the problem is for instance 'type' is horizontal but vertically listed under date! When melting, it looks at 'type' as a 'date' which is clearly wrong. Any help would be appreciated! Thanks!
I tried any combo of melt, pivot and stack..
Your forgot about T
:
tmp = df.T # or df.transpose()
out = (
tmp.iloc[1:]
.set_axis(tmp.iloc[0], axis=1)
.set_index(["model", "type"])
.rename_axis(index=[None]*2, columns=None) # optional
)
Output :
print(out)
2020 2021 2022 2023
model_1 type_1 1 2 3 4
type_2 5 6 7 8
model_2 type_3 9 10 11 12