I have a DataFrame (df), shown below.
Packet Orgin Destination Delivery_Time
A1 NYK HAM 6
A1 NYK HAM 5
A1 NYK HAM 6
A1 NYK HAM 6
A1 NYK HAM 3
A1 NYK HAM 4
A1 NYK HAM 8
B1 HK JP 2
B1 HK JP 4
B1 HK JP 2
B1 HK JP 4
B1 HK JP 4
B1 HK JP 4
B1 HK JP 3
B1 HK JP 5
B1 HK JP 5
B1 HK JP 6
C1 CDG LUX 1
D1 MEX NYK 3
I want to calculate the median of the DataFrame (df) and attach it back to the DataFrame as a new column.
How can this be done? I have around 50K records to group.
Use GroupBy.transform with median:
df['med'] = df.groupby('Packet')['Delivery_Time'].transform('median')
print(df)
Packet Orgin Destination Delivery_Time med
0 A1 NYK HAM 6 6
1 A1 NYK HAM 5 6
2 A1 NYK HAM 6 6
3 A1 NYK HAM 6 6
4 A1 NYK HAM 3 6
5 A1 NYK HAM 4 6
6 A1 NYK HAM 8 6
7 B1 HK JP 2 4
8 B1 HK JP 4 4
9 B1 HK JP 2 4
10 B1 HK JP 4 4
11 B1 HK JP 4 4
12 B1 HK JP 4 4
13 B1 HK JP 3 4
14 B1 HK JP 5 4
15 B1 HK JP 5 4
16 B1 HK JP 6 4
17 C1 CDG LUX 1 1
18 D1 MEX NYK 3 3
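For anyone who wants to try this without the full data, here is a minimal self-contained sketch. It assumes a small made-up subset of the question's rows (the values are illustrative, not the original 50K records) and keeps the column names as posted, including "Orgin". It also contrasts transform with a plain per-group median, which may help show why transform is the one that fits back into the original frame.

```python
import pandas as pd

# Illustrative subset only; not the full data from the question.
df = pd.DataFrame({
    "Packet": ["A1", "A1", "A1", "B1", "B1", "C1"],
    "Orgin": ["NYK", "NYK", "NYK", "HK", "HK", "CDG"],
    "Destination": ["HAM", "HAM", "HAM", "JP", "JP", "LUX"],
    "Delivery_Time": [6, 5, 3, 2, 4, 1],
})

grouped = df.groupby("Packet")["Delivery_Time"]

# A plain median collapses each group to a single row (indexed by Packet).
per_group = grouped.median()

# transform computes the same per-group value but returns a Series aligned
# with the original index, so it can be assigned directly as a new column.
df["med"] = grouped.transform("median")

print(per_group)
print(df)
```

Because the result of transform has the same length and index as df, no merge or join is needed to attach it.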