I have a df with weather reporting data. It has over 2 million rows and the following columns.
ID MONTH TEMP
1 1 0
1 1 10
2 1 50
2 1 60
3 1 80
3 1 90
1 2 0
1 2 10
2 2 50
2 2 60
3 2 80
3 2 90
I am looking to create an column for the average monthly temperature. I need a faster way than for-loops. The values for average monthly temperature are from the TEMP column. I would like them to be specific to each ID for each MONTH.
ID MONTH TEMP AVE MONTHLY TEMP
1 1 0 5
1 1 10 5
2 1 50 55
2 1 60 55
3 1 80 85
3 1 90 85
1 2 0 5
1 2 10 5
2 2 50 55
2 2 60 55
3 2 80 85
3 2 90 85
Use groupby.transform
:
df['AVE MONTHLY TEMP']=df.groupby(['ID','MONTH'])['TEMP'].transform('mean')
print(df)
Output
ID MONTH TEMP AVE MONTHLY TEMP
0 1 1 0 5
1 1 1 10 5
2 2 1 50 55
3 2 1 60 55
4 3 1 80 85
5 3 1 90 85
6 1 2 0 5
7 1 2 10 5
8 2 2 50 55
9 2 2 60 55
10 3 2 80 85
11 3 2 90 85