Could you let know how to average column values with same value in the first column as below? I'd like to calculate calculate every n rows of all columns. for example, the average of all the values from a1~a6 with site value '1' equals to 3. (my data has numerous columns and rows)
raw = {'site':['1','1','1','1','2','2','2','2'],
'a1':['1','2','3','4','5','6','7','8'],
'a2':['2','3','4','5','6','7','8','9'],
'a3':['2','3','4','5','6','7','8','9'],
'a4':['2','3','4','5','6','7','8','9'],
'a5':['2','3','4','5','6','7','8','9'],
'a6':['2','3','4','5','6','7','8','9']
}
df_tmp = pd.DataFrame(raw,columns = ['site','a1','a2','a3','a4','a5','a6'])
print(df_tmp)
Thank you
IIUC, DataFrame.melt
+ mean for each site with GroupBy.mean
# df_tmp = df_tmp.astype(int) # get correct result
df_tmp.melt('site').groupby('site')['value'].mean()
Or:
# df_tmp = df_tmp.astype(int) # get correct result
df_tmp.set_index('site').stack().groupby(level=0).mean()
#df_tmp.set_index('site').stack().mean(level=0) # .mean(level=0) deprecated
Output
site
1 3.333333
2 7.333333
Name: value, dtype: float64