My dataframe contains quarterly data and for some companies monthly data as well.
import pandas as pd
df = pd.DataFrame({'quarter': ['2010-1', '2010-2', '2010-3','2010-4', '2011-1'],
'volume_quarter': [450, 450, 450, 450, 450],
'volume_month_1': [150, 150, 150, 150, 150],
'volume_month_2': [160, 160, 160, 160, 160],
'volume_month_3': [140, 140, 140, 140, 140]})
df
Gives:
quarter volume_quarter volume_month_1 volume_month_2 volume_month_3
2010-1 450 150 160 140
2010-2 450 150 160 140
2010-3 450 150 160 140
2010-4 450 150 160 140
2011-1 450 150 160 140
With the following code:
pd.melt(df, id_vars = ['quarter'], value_vars=['volume_month_1', "volume_month_2", "volume_month_3"])
I get:
quarter variable value
0 2010-1 volume_month_1 150
1 2010-2 volume_month_1 150
2 2010-3 volume_month_1 150
3 2010-4 volume_month_1 150
4 2011-1 volume_month_1 150
5 2010-1 volume_month_2 160
6 2010-2 volume_month_2 160
7 2010-3 volume_month_2 160
8 2010-4 volume_month_2 160
9 2011-1 volume_month_2 160
10 2010-1 volume_month_3 140
11 2010-2 volume_month_3 140
12 2010-3 volume_month_3 140
13 2010-4 volume_month_3 140
14 2011-1 volume_month_3 140
Instead I'm trying to achieve the following:
quarter variable value
0 2010-1 volume_month_1 150
1 2010-1 volume_month_2 160
2 2010-1 volume_month_3 140
3 2010-2 volume_month_1 150
4 2010-2 volume_month_2 160
5 2010-2 volume_month_3 140
6 2010-3 volume_month_1 150
7 2010-3 volume_month_2 160
8 2010-3 volume_month_3 140
9 2010-4 volume_month_1 150
10 2010-4 volume_month_2 160
11 2010-4 volume_month_3 140
12 2011-1 volume_month_1 150
13 2011-1 volume_month_2 160
14 2011-1 volume_month_3 140
I'd like to achieve this, so I can run the Arima model on the montly values.
Million thanks in advance !
You only missed sorting, this line of code:
df = (
pd.melt(
df,
id_vars=["quarter"],
value_vars=["volume_month_1", "volume_month_2", "volume_month_3"],
)
.sort_values(by="quarter")
.reset_index(drop=True)
)
returns as you desired:
quarter variable value
0 2010-1 volume_month_1 150
1 2010-1 volume_month_2 160
2 2010-1 volume_month_3 140
3 2010-2 volume_month_1 150
4 2010-2 volume_month_2 160
5 2010-2 volume_month_3 140
6 2010-3 volume_month_1 150
7 2010-3 volume_month_2 160
8 2010-3 volume_month_3 140
9 2010-4 volume_month_1 150
10 2010-4 volume_month_2 160
11 2010-4 volume_month_3 140
12 2011-1 volume_month_1 150
13 2011-1 volume_month_2 160
14 2011-1 volume_month_3 140