I am quite new to Pandas and I am trying to do the following thing:
I have two dataframes comms
and arts
that look like this (except for the fact they are longer ad with other columns)
comms:
ID commScore
10 5
10 3
10 -1
11 0
11 2
12 9
13 -2
13 -1
13 1
13 4
arts:
ID commNumber
10 3
11 2
12 1
13 4
I need to group comms
by their ID, and then save in arts
(obviously in the correct ID line) the interquartile range (IQR) of the commScore distribution for each ID.
I've already tried using groupby
, agg
and map
, but since my notions of pandas
are quite limited, I just couldn't do what I was looking for.
Does anyone have a solution?
Thanks
Andrea
We can group
the dataframe by ID
and aggregate column commScore
using the function iqr
from scipy.stats
to calculate inter quartile range, then map
this calculated iqr
range on the column ID
of the arts
dataframe
from scipy.stats import iqr
arts['IQR'] = arts['ID'].map(comms.groupby('ID')['commScore'].agg(iqr))
ID commNumber IQR
0 10 3 3
1 11 2 1
2 12 1 0
3 13 4 3