Search code examples
pythonpandasdataframeopenpyxlxlsxwriter

How to calculate the average of multiple iterations using python?


Below is a table with results from multiple runs of different scenario ( like Test_2k, Result_2k, Test_1K, Result_5k). I need to calculate the average score for each scenario as shown in the table. Could you please suggest how can we calculate the average score for each scenario as in below table using python?

enter image description here


Solution

  • You can get the scenario by ignoring everything after the last "_" in the scenario column, then groupby:

    df["Average"] = df.groupby(df["Scenario"].str.rsplit(pat="_",n=1,expand=True)[0])["Score"].transform("mean")
    
    
    >>> df
            Scenario   Score   Average
    0     Test_2k_R1  0.1095  0.456360
    1     Test_2k_R2  0.7399  0.456360
    2     Test_2k_R3  0.7521  0.456360
    3     Test_2k_R4  0.0200  0.456360
    4     Test_2k_R5  0.6603  0.456360
    5     Test_5k_R1  0.6839  0.490467
    6     Test_5k_R2  0.5649  0.490467
    7     Test_5k_R3  0.2226  0.490467
    8   Result_2k_R1  0.3966  0.553400
    9   Result_2k_R2  0.7102  0.553400
    10  Result_5k_R5  0.8727  0.538967
    11  Result_5k_R3  0.1931  0.538967
    12  Result_5k_R7  0.5511  0.538967
    13    Test_1k_R9  0.4124  0.299950
    14    Test_1k_R1  0.5594  0.299950
    15    Test_1k_R$  0.1663  0.299950
    16   Test_1k_R12  0.0617  0.299950