Below is a table with results from multiple runs of different scenario ( like Test_2k, Result_2k, Test_1K, Result_5k). I need to calculate the average score for each scenario as shown in the table. Could you please suggest how can we calculate the average score for each scenario as in below table using python?
You can get the scenario by ignoring everything after the last "_" in the scenario column, then groupby
:
df["Average"] = df.groupby(df["Scenario"].str.rsplit(pat="_",n=1,expand=True)[0])["Score"].transform("mean")
>>> df
Scenario Score Average
0 Test_2k_R1 0.1095 0.456360
1 Test_2k_R2 0.7399 0.456360
2 Test_2k_R3 0.7521 0.456360
3 Test_2k_R4 0.0200 0.456360
4 Test_2k_R5 0.6603 0.456360
5 Test_5k_R1 0.6839 0.490467
6 Test_5k_R2 0.5649 0.490467
7 Test_5k_R3 0.2226 0.490467
8 Result_2k_R1 0.3966 0.553400
9 Result_2k_R2 0.7102 0.553400
10 Result_5k_R5 0.8727 0.538967
11 Result_5k_R3 0.1931 0.538967
12 Result_5k_R7 0.5511 0.538967
13 Test_1k_R9 0.4124 0.299950
14 Test_1k_R1 0.5594 0.299950
15 Test_1k_R$ 0.1663 0.299950
16 Test_1k_R12 0.0617 0.299950