I have this dataset
custID documentno quantity producttype
1 1281261658 3 accesories
1 1281261658 3 major
1 1281261658 3 accesories
2 0105162445 2 major
2 0105162445 2 major
3 0100848728 2 major
3 0100848728 2 accesories
4 0106075074 1 major
So, I want to filter the dataset in a way I want documentno's with atleast one 'accesories' in it. So I want to remove documents with only 'major' in it.
custID = {1,1,1,2,2,3,3,4}
documentno = {1281261658,1281261658,1281261658,0105162445,0105162445,0100848728,0100848728,0106075074}
quantity = {3,3,3,2,2,2,2,1}
producttype = {'accesories','major','accesories','major','major','major','accesories','major'}
the final result is:
id documentno quantity producttype
1 1281261658 3 accesories
1 1281261658 3 major
1 1281261658 3 accesories
3 0100848728 2 major
3 0100848728 2 accesories
I tried with if else, but I couldn't do it.
Thanks in advance
Given pd.DataFrame
df
:
df = pd.DataFrame({'id' : [1,1,1,2,2,3,3,4],
'documentno' : ['1281261658','1281261658','1281261658','0105162445','0105162445','0100848728','0100848728','0106075074'],
'quantity' : [3,3,3,2,2,2,2,1],
'producttype' : ['accesories','major','accesories','major','major','major','accesories','major']})
Try the following:
df.loc[df.documentno.isin(df.documentno[df.producttype.eq('accesories')].unique()),:]
Result is:
id documentno quantity producttype
0 1 1281261658 3 accesories
1 1 1281261658 3 major
2 1 1281261658 3 accesories
5 3 0100848728 2 major
6 3 0100848728 2 accesories
As a filter df.documentno[df.producttype.eq('accesories')]
is used which returns documentno
values that have one or more accesories
.
I assume that quantity
in dataset df
is always greatest than 0.
However if quantity
may be zero:
df = pd.DataFrame({'id' : [1,1,1,2,2,3,3,4],
'documentno' : ['1281261658','1281261658','1281261658','0105162445','0105162445','0100848728','0100848728','0106075074'],
'quantity' : [3,3,3,2,2,2,0,1],
'producttype' : ['accesories','major','accesories','major','major','major','accesories','major']})
just add condition df.quantity.gt(0)
:
df.loc[ df.documentno.isin(df.documentno[(df.producttype.eq('accesories') & df.quantity.gt(0))].unique()) ,:]