pandas median grouped-table

Pandas get median/average of pre-aggregated data


Assuming my data is already grouped, how can I calculate the median and other statistics?

Index  Value  Count
0      6      2
1      2      3
2      9      8

In the example above, I want to get the median, average, etc. of column 'Value', taking the column 'Count' into account.

The actual values are 2,2,2,6,6,9,9,9,9,9,9,9,9 so my median would be 9.


Solution

  • If I understand correctly, the average is the mean of Value weighted by Count:

    print ((df['Value']*df['Count']).sum()/df['Count'].sum())
    6.923076923076923
    

    For the median, use np.repeat (with numpy imported as np) to expand each Value by its Count, then take the median of the result:

    print (np.repeat(df['Value'], df['Count']).median())
    9.0