To say I have a dataframe with three columns like:
index | A | B | C |
---|---|---|---|
1 | foo | One | 1 |
2 | foo | Two | 2 |
3 | foo | Three | 3 |
4 | bar | One | 2 |
5 | bar | One | 1 |
6 | num | Two | 3 |
7 | num | Three | 3 |
In this case, how may I filter out the rows that have the same value in column B but more than one respective value in column C by using Python Pandas?
The rows that I need is 1, 2, 4, 5, 6 because "One" in column B has two corresponding values (1 and 2) in column C and "Two" in column B has two corresponding values as well. Eventually I want to group them by column A if possible.
You can try groupby
B
column then filter
by the value_counts
of C
column.
out = df.groupby('B').filter(lambda group: len(group['C'].value_counts()) > 1)
print(out)
index A B C
0 1 foo One 1
1 2 foo Two 2
3 4 bar One 2
4 5 bar One 1
5 6 num Two 3