I'm trying to evaluate some images based on the classification. I use the piece of code below to read the csv file:
import pandas as pd
file = pd.read_csv('test.csv', header=None)
So I have something that looks like this:
Image1 2 3 4 5 Green
Image1 3 4 5 6 Red
Image2 4 5 6 7 Red
Image3 1 4 8 9 Green
Image4 5 3 0 1 Yellow
Image4 6 2 1 1 Green
So in case I want to keep the images with the value "Green" the output should look like this:
Image1 2 3 4 5 Green
Image1 3 4 5 6 Red
Image3 1 4 8 9 Green
Image4 5 3 0 1 Yellow
Image4 6 2 1 1 Green
which means that I want to keep the images with the same id in the first column when there is at least one with the element I check is in the last column.
I used the isin
method but I don't know how to keep the images the rest of the rows with the images that do have at least on time the value "Green" in the last column.
We can use GroupBy.any
here, where we check if any of the rows suffice our condition:
df[df[5].eq("Green").groupby(df[0]).transform("any")]
0 1 2 3 4 5
0 Image1 2 3 4 5 Green
1 Image1 3 4 5 6 Red
3 Image3 1 4 8 9 Green
4 Image4 5 3 0 1 Yellow
5 Image4 6 2 1 1 Green