Let's assume I have a df that looks like this:
import pandas as pd
d = {'group': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
'number': [0, 3, 2, 1, 2, 1, -2, 1, 2, 3, 4, 2, 1, -1, 0]}
df = pd.DataFrame(data=d)
df
group number
0 A 0
1 A 3
2 A 2
3 A 1
4 A 2
5 B 1
6 B -2
7 B 1
8 B 2
9 B 3
10 C 4
11 C 2
12 C 1
13 C -1
14 C 0
I would like to delete a whole group if any of its values in the number column is negative. I tried:
df.groupby('group').filter(lambda g: (g.number < 0).any())
However, this gives me the wrong output: it keeps exactly the groups that contain a negative value in the number column, which is the opposite of what I want. See below:
group number
5 B 1
6 B -2
7 B 1
8 B 2
9 B 3
10 C 4
11 C 2
12 C 1
13 C -1
14 C 0
How do I change this call so that it returns only the groups without any negative numbers in the number column? The output should be group A with its values.
Use the boolean NOT operator ~ to negate the condition:
df.groupby('group').filter(lambda g: ~(g.number < 0).any())
Or, by De Morgan's law, check that all values in the group are non-negative:
df.groupby('group').filter(lambda g: (g.number >= 0).all())
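As a quick check, here is a minimal self-contained sketch that runs both ideas on the data from the question. It uses Python's `not` in place of `~`, which should be equivalent here as long as `.any()` returns a single boolean. The third variant, a row mask built with `transform`, is an alternative that is not part of the answer above. The variable names (`kept_not_any`, `kept_all`, `kept_mask`) are only illustrative.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "group": list("AAAAABBBBBCCCCC"),
    "number": [0, 3, 2, 1, 2, 1, -2, 1, 2, 3, 4, 2, 1, -1, 0],
})

# Variant 1: keep a group only if it is NOT true that any value is negative.
kept_not_any = df.groupby("group").filter(lambda g: not (g["number"] < 0).any())

# Variant 2 (De Morgan's law): keep a group only if all values are non-negative.
kept_all = df.groupby("group").filter(lambda g: (g["number"] >= 0).all())

# Variant 3 (alternative, not from the answer): compute a per-row boolean mask
# that is True for every row whose group has only non-negative values.
mask = df["number"].ge(0).groupby(df["group"]).transform("all")
kept_mask = df[mask]

print(kept_all)
```

All three variants should leave only the rows of group A. The mask-based variant avoids calling a Python lambda once per group, which may matter on frames with many groups.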