a) How to find the largest five SNRs when the ID is repeated? And also I want all these three columns as the output. b) I also want the eliminated lines as the output.
FIT ID SNR
1011563.fit, J16142485-3141000 , 36
1011729.fit, J17210134-3757437 , 18
1011730.fit, J17210134-3757437 , 20
1011731.fit, J17210134-3757437 , 20
1011732.fit, J17210134-3757437 , 13
1011914.fit, J17210134-3757437 , 38
1011915.fit, J17210134-3757437 , 26
1011916.fit, J17210134-3757437 , 19
1011917.fit, J17210134-3757437 , 47
1011918.fit, J17210134-3757437 , 25 ´´´
Expected output for a.
FITS ID SNR
```8 1011917.fit J17210134-3757437 47
5 1011914.fit J17210134-3757437 38
0 1011563.fit J16142485-3141000 36
6 1011915.fit J17210134-3757437 26
9 1011918.fit J17210134-3757437 25
2 1011730.fit J17210134-3757437 20 ´´´
Output b)
``` FITS ID SNR
1 1011729.fit J17210134-3757437 18
6 1011915.fit J17210134-3757437 26
7 1011916.fit J17210134-3757437 19
8 1011917.fit J17210134-3757437 47´´´
As you can see SNR "6 1011915.fit J17210134-3757437 26" and
"8 1011917.fit J17210134-3757437 47" are repeated. But I want this only as output a and not b.
Use groupby
+head
on the sorted dataframe to get the indices, then slice:
idx = df.sort_values(by='SNR', ascending=False).groupby('ID').head(5).index
df2 = df.loc[idx]
output:
FITS ID SNR
0 1004234.fits J16355032-2814188 714
4 1004238.fits J16355032-2814188 690
11 1004245.fits J16355032-2814188 645
8 1004242.fits J16355032-2814188 635
9 1004243.fits J16355032-2814188 522
17 1005114.fits J22154748+4954052 227
16 1005113.fits J22154748+4954052 212
13 1004476.fits J22152631+4958343 162
19 1005116.fits J22154748+4954052 160
18 1005115.fits J22154748+4954052 148
15 1004478.fits J22152631+4958343 103
14 1004477.fits J22152631+4958343 76
12 1004475.fits J22152631+4958343 62
Other rows:
df3 = df.loc[df.index.difference(idx)]
output:
FITS ID SNR
1 1004235.fits J16355032-2814188 444
2 1004236.fits J16355032-2814188 331
3 1004237.fits J16355032-2814188 492
5 1004239.fits J16355032-2814188 491
6 1004240.fits J16355032-2814188 489
7 1004241.fits J16355032-2814188 382
10 1004244.fits J16355032-2814188 385