Search code examples
pythonpandasdataframepandas-groupbyquantile

Pandas: groupby and then retrieving IQR


I am quite new to Pandas and I am trying to do the following thing:

I have two dataframes comms and arts that look like this (except for the fact they are longer ad with other columns)

comms:

ID    commScore           
10       5                
10       3                  
10      -1                 
11       0                
11       2              
12       9      
13      -2     
13      -1     
13       1      
13       4

arts:

ID    commNumber
10        3 
11        2    
12        1
13        4      

I need to group comms by their ID, and then save in arts (obviously in the correct ID line) the interquartile range (IQR) of the commScore distribution for each ID.

I've already tried using groupby, agg and map , but since my notions of pandas are quite limited, I just couldn't do what I was looking for.

Does anyone have a solution?

Thanks

Andrea


Solution

  • We can group the dataframe by ID and aggregate column commScore using the function iqr from scipy.stats to calculate inter quartile range, then map this calculated iqr range on the column ID of the arts dataframe

    from scipy.stats import iqr
    
    arts['IQR'] = arts['ID'].map(comms.groupby('ID')['commScore'].agg(iqr))
    

       ID  commNumber  IQR
    0  10           3    3
    1  11           2    1
    2  12           1    0
    3  13           4    3