I have a data frame that looks like this (Treatment can take many different string values; I have simplified it for the question):
ID Position Treatment
--20AxECvv- 0 A
--20AxECvv- -1 A
--20AxECvv- -2 A
--h9INKewQf- 0 A
--h9INKewQf- -1 B
--h9INKewQf- -2 B
--h9INKewQf- -3 B
--h9INKewQf- -4 B
--h9INKewQf- -5 B
--h9INKewQf- -6 B
--h9INKewQf- -7 B
zZU7a@8jN 0 B
QUeSNEXmdB 0 C
QUeSNEXmdB -1 C
qu72Ql@h79 0 C
For each ID, I just want to keep the row with the minimum Position (and its Treatment), using Python.
The result would be:
ID Position Treatment
--20AxECvv- -2 A
--h9INKewQf- -7 B
zZU7a@8jN 0 B
QUeSNEXmdB -1 C
qu72Ql@h79 0 C
I am a beginner. I think I have to use a groupby and filter by the minimum position, but I have no idea how to write it. Thanks in advance :)
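You can also sort your dataframe by Position using df.sort_values and then drop the duplicate IDs using df.drop_duplicates, which by default keeps the first row for each ID (here, the one with the minimum Position):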
In [795]: df.sort_values('Position').drop_duplicates('ID')
Out[795]:
ID Position Treatment
10 --h9INKewQf- -7 B
2 --20AxECvv- -2 A
13 QUeSNEXmdB -1 C
11 zZU7a@8jN 0 B
14 qu72Ql@h79 0 C
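For comparison with the groupby idea raised in the question, here is a minimal self-contained sketch that rebuilds the example frame and applies both approaches. The variable names by_sort and by_group are only illustrative, and the groupby/idxmin variant is an alternative I am adding here, not something taken from the answer above.

```python
import pandas as pd

# Rebuild the example frame from the question (15 rows, default index 0..14).
df = pd.DataFrame({
    "ID": ["--20AxECvv-"] * 3 + ["--h9INKewQf-"] * 8
          + ["zZU7a@8jN", "QUeSNEXmdB", "QUeSNEXmdB", "qu72Ql@h79"],
    "Position": [0, -1, -2, 0, -1, -2, -3, -4, -5, -6, -7, 0, 0, -1, 0],
    "Treatment": ["A"] * 4 + ["B"] * 8 + ["C"] * 3,
})

# Approach from the answer: sort ascending so the minimum Position comes
# first for every ID, then keep only the first row seen for each ID.
by_sort = df.sort_values("Position").drop_duplicates("ID")

# Alternative sketch: idxmin gives the index label of the minimum Position
# within each ID group; .loc then selects exactly those rows.
# sort=False keeps the groups in order of first appearance.
by_group = df.loc[df.groupby("ID", sort=False)["Position"].idxmin()]

print(by_sort)
print(by_group)
```

Both select the same rows; they differ only in row order. The sort-based version orders the result by Position, while the groupby version keeps the IDs in the order they first appear, which matches the desired output in the question.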