Num Year 1 2 3 4 5 6 7 8 9 10 11 12
111 2009 0 0 667 584 343 245 0 0 231 742 0 482
111 2010 0 10 624 4 143 445 20 991 0 742 0 482
112 2008 .....
Dataset is like this. Num, Year : Multiindex in Dataframe(pandas)
And desired output is scaled data of columns from 1 to 12 using group by Num. So I can compare the Usage(for each months) of Each year only with same 'Num'.
print(df_pivot.groupby('Num')[1].apply(lambda x: MinMaxScaler().fit_transform(x.values.reshape(-1,1))))
print(df_pivot.groupby('Num')[1].apply(lambda x: MinMaxScaler().fit_transform(x.values.reshape(-1,1))).loc[11111])
########
First Output is :
11111 [[0.0], [0.0], [0.8012012012012012], [0.690210...
11112 [[0.5682020471740098], [0.5516243880729862], [...
11115 [[0.0], [0.0], [0.9999999999999999], [0.747932...
-----------------
Second Output is :
[[0. ]
[0.97836918]
[0.91226151]
[1. ]
[0.68939118]
[0.51707376]
[0.50893183]]
If look at the First Output, there is no Year index but only the Num index. So the result is returned in form of pandas.Series but this is not the desired output.
Can somebody help to get the data like??
11111 2010 0.0 0.3 0.4 1.0 0.5 ....
11111 2011 0.8 ....
11111 2012 0.2 ....
....
Instead of:
df_pivot.groupby('Num')[1]
you have to write:
df_pivot.groupby(['Num','Year'])
For example:
print(df_pivot.groupby(['Num','Year']).apply(lambda x: MinMaxScaler().fit_transform(x.values.reshape(-1,1)))))