data = {'ITEM': ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'z', 'z', 'z', 'z'],
'col1': [37, 84, 54, 72, 49, 20, 64, 75, 90, 71, 84, 46, 78, 24, 34],
'col2': [27, 44, 43, 12, 92, 10, 24, 55, 50, 37, 44, 57, 38, 64, 17],
'col3': [17, 54, 64, 68, 39, 50, 34, 65, 40, 72, 44, 56, 48, 94, 37], }
want = data.groupby('ITEM').agg({'col1' : ['median', 'std','rabbit'],
'col2' : ['median', 'std','rabbit'],
'col3' : ['median', 'std','rabbit']
})
I want to combine 'col1's' average with 'col1's std times 4. And I call it as "rabbit"
and want to make new column ['average', 'std', 'rabbit']
Are you looking for:
>>> (df.groupby('ITEM', as_index=False)[['col1', 'col2', 'col3']]
.agg(lambda x: x.mean() + 4 * x.std()))
ITEM col1 col2 col3
0 a 134.107944 163.886325 131.663437
1 b 169.413058 121.782536 111.677418
2 c 134.240115 80.269553 146.509668
3 d NaN NaN NaN
4 z 139.336738 128.348483 157.778615
Or maybe:
>>> pd.concat([df,
df.groupby('ITEM')[['col1', 'col2', 'col3']]
.transform(lambda x: x.mean() + 4 * x.std()).add_prefix('rabbit_')],
axis=1)
ITEM col1 col2 col3 rabbit_col1 rabbit_col2 rabbit_col3
0 a 37 27 17 134.107944 163.886325 131.663437
1 a 84 44 54 134.107944 163.886325 131.663437
2 a 54 43 64 134.107944 163.886325 131.663437
3 a 72 12 68 134.107944 163.886325 131.663437
4 a 49 92 39 134.107944 163.886325 131.663437
5 b 20 10 50 169.413058 121.782536 111.677418
6 b 64 24 34 169.413058 121.782536 111.677418
7 b 75 55 65 169.413058 121.782536 111.677418
8 c 90 50 40 134.240115 80.269553 146.509668
9 c 71 37 72 134.240115 80.269553 146.509668
10 d 84 44 44 NaN NaN NaN
11 z 46 57 56 139.336738 128.348483 157.778615
12 z 78 38 48 139.336738 128.348483 157.778615
13 z 24 64 94 139.336738 128.348483 157.778615
14 z 34 17 37 139.336738 128.348483 157.778615