I have a dataframe like the one below, and I am trying to calculate a simple bias by comparing two columns, 'obsvals' and 'modelvals'. For each month I need the difference between the two (in the 'Bias' column below this is 'obsvals' minus 'modelvals'), and then I need to combine the month 1 and month 2 differences into a single bias per plant. The expected output further down is the average of the two monthly values. I'm not sure how to do that in Python. I'm guessing some combination of groupby on 'plant_name' and maybe a lambda function?
Here is the dataframe:
plant_name year month obsvals modelvals Bias
0 ARIZONA I 2021 1 8.90 8.30 0.60
1 ARIZONA I 2021 2 7.98 7.41 0.57
3 CAETITE I 2021 1 9.10 7.78 1.32
4 CAETITE I 2021 2 6.05 6.02 0.03
My final answer should look like:
plant_name year Bias
0 ARIZONA I 2021 0.58
1 CAETITE I 2021 0.67
Thank you for your time.
IIUC, you need groupby:
# passing the string 'mean' means numpy does not have to be imported
df = df.groupby(['plant_name', 'year']).agg({'Bias': 'mean'}).reset_index()
plant_name year Bias
0 ARIZONA I 2021 0.585
1 CAETITE I 2021 0.675
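If the 'Bias' column is not already in the dataframe, it can be built from the two value columns first. Here is a minimal, self-contained sketch that rebuilds the sample data from the question. It assumes the sign convention implied by the posted 'Bias' column (obsvals minus modelvals). Since the question mentions summing while the expected output is an average, it computes both; the column names 'mean' and 'sum' in the result simply come from the aggregation list and are not from the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "plant_name": ["ARIZONA I", "ARIZONA I", "CAETITE I", "CAETITE I"],
    "year": [2021, 2021, 2021, 2021],
    "month": [1, 2, 1, 2],
    "obsvals": [8.90, 7.98, 9.10, 6.05],
    "modelvals": [8.30, 7.41, 7.78, 6.02],
})

# Assumed sign convention (matches the posted Bias column): observed minus modelled
df["Bias"] = df["obsvals"] - df["modelvals"]

# One row per plant and year, with both the average and the total of the monthly biases
result = (
    df.groupby(["plant_name", "year"])["Bias"]
      .agg(["mean", "sum"])
      .reset_index()
)

print(result.round(3))
```

Use the 'mean' column if you want the output shown in the question, or the 'sum' column if you really want the cumulative bias over the two months.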