I have a df1 that looks like this:
year site 1 2 3 year site 1 2 3 year site 1 2 3 year site 1 2 3
1991 A 4.1 5.9 4.1 1991 B 3.3 4.1 4.1 1991 C 4.1 0.6 4.1 1991 D 4.1 4.1 4.1
1992 A 6.2 5.7 6.2 1992 B 6.2 7.1 6.2 1992 C 6.2 6.2 6.2 1992 D 6.2 9.5 7.4
1993 A 2.6 1.9 4.7 1993 B 2.6 6.2 2.6 1993 C 5.4 8.3 2.6 1993 D 0.4 2.6 2.6
And, I am having trouble transposing the columns - which are months (1,2,3) - to rows for each site so that my reshaped df1 or df2 looks like this:
year month Site A Site B Site C Site D
1991 1 4.1 3.3 4.1 4.1
1991 2 5.9 4.1 0.6 4.1
1991 3 4.1 4.1 4.1 4.1
1992 1 6.2 6.2 6.2 6.2
1992 2 5.7 7.1 6.2 9.5
1992 3 6.2 6.2 6.2 7.4
1993 1 2.6 2.6 5.4 0.4
1993 2 1.9 6.2 8.3 2.6
1993 3 4.7 2.6 2.6 2.6
I have tried using 'melt' and 'stack' but I do not understand how to reference the repeating months (1,2,3). Thank you,
Try the following using hard index slicing and reshaping:
#Create input dataframe
np.random.seed(0)
df = pd.concat([pd.DataFrame({'year':[1991, 1992, 1993],
'site':[i]*3,
1:np.round(np.random.randint(2,8,3)+np.random.random(3),1),
2:np.round(np.random.randint(2,8,3)+np.random.random(3),1),
3:np.round(np.random.randint(2,8,3)+np.random.random(3),1)}) for i in [*'ABC']], axis=1)
# index slice columns of the dataframe
df_out = pd.concat([df.iloc[:,i:i+5] for i in range(0,df.shape[1],5)])
# Reshape with melt, set_index, and unstack
df_out = df_out.melt(['year', 'site'], var_name='month')\
.set_index(['year', 'month', 'site'])['value']\
.unstack('site').add_prefix('Site ')\
.reset_index()
print(df_out)
Output:
site year month Site A Site B Site C
0 1991 1 6.6 6.0 5.5
1 1991 2 7.3 5.5 7.6
2 1991 3 3.9 2.5 4.7
3 1992 1 7.5 2.1 5.6
4 1992 2 4.1 2.8 7.9
5 1992 3 2.1 3.8 2.1
6 1993 1 2.4 5.9 4.0
7 1993 2 6.3 3.1 2.7
8 1993 3 3.1 3.1 3.7