I have a dataframe with a User_id column and some information about each user:
User_id type info
31 R*1005 no
31 R*10335 no
25 R*1005 no
25 R*243 no
25 R*4918 yes
25 R*9017 no
25 R*9015 no
46 R*9470 no
For each User_id, I want to drop the row where the info column is "yes", together with all of that user's rows before it.
For the data above, the result should be:
User_id type info
31 R*1005 no
31 R*10335 no
25 R*9017 no
25 R*9015 no
46 R*9470 no
How can I do this in a smart way?
The idea is to test whether each group contains at least one "yes", and then, for those groups, remove the "yes" row and the rows before it:
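# True where info is "yes"
m = df['info'].eq('yes')
g = m.groupby(df['User_id'])
# True for every row of a user that has at least one "yes"
m1 = g.transform('any')
# True from the first "yes" row of each user onward
m2 = g.cumsum().ne(0)
# keep users without a "yes", or rows from the first "yes" on, then drop the "yes" rows
df = df[(~m1 | m2) & ~m]
print(df)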
User_id type info
0 31 R*1005 no
1 31 R*10335 no
5 25 R*9017 no
6 25 R*9015 no
7 46 R*9470 no
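To try this end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same masks. The helper name drop_up_to_yes and the variable names has_yes and after_first_yes are only illustrative, not part of pandas. It assumes the rows are already in the desired order within each User_id.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "User_id": [31, 31, 25, 25, 25, 25, 25, 46],
    "type": ["R*1005", "R*10335", "R*1005", "R*243",
             "R*4918", "R*9017", "R*9015", "R*9470"],
    "info": ["no", "no", "no", "no", "yes", "no", "no", "no"],
})


def drop_up_to_yes(frame):
    """Hypothetical helper: per User_id, drop the "yes" rows and the rows before the first one."""
    m = frame["info"].eq("yes")              # True on "yes" rows
    g = m.groupby(frame["User_id"])
    has_yes = g.transform("any")             # user has at least one "yes"
    after_first_yes = g.cumsum().ne(0)       # rows from the first "yes" onward
    return frame[(~has_yes | after_first_yes) & ~m]


result = drop_up_to_yes(df)
print(result)
```

Note that if a user has more than one "yes", this keeps the "no" rows that follow the first "yes" and only removes the "yes" rows themselves.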