Pandas - groupby column then specify condition


I have a dataframe df that looks like this:

   Batch   Fruit  Property1  Property2  Property3
0      1   Apple         38         55         52
1      1  Banana         59         37         47
2      1   Pear          62         34         25
3      2   Apple         95         64         48
4      2  Banana         10         84         39
5      2   Pear          16         87         38
6      3   Apple         29         34         49
7      3  Banana         27         41         51
8      3   Pear          35         33         17
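
To reproduce the example, the frame above can be rebuilt roughly as follows. This is a minimal sketch: the values are copied from the table, and the integer dtypes and default index are assumptions.

```python
import pandas as pd

# Values copied from the table above; dtypes and the default index are assumed
df = pd.DataFrame({
    "Batch": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Fruit": ["Apple", "Banana", "Pear"] * 3,
    "Property1": [38, 59, 62, 95, 10, 16, 29, 27, 35],
    "Property2": [55, 37, 34, 64, 84, 87, 34, 41, 33],
    "Property3": [52, 47, 25, 48, 39, 38, 49, 51, 17],
})
print(df)
```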

I want to add a column 'Status' to the dataframe, whose value is either 'keep' or 'remove'. All Fruits within a Batch should get 'Status' == 'keep' only when every one of these conditions holds for that Batch; otherwise the whole Batch is 'remove':

  1. Apple has all Property1 < 30, Property2 < 40, Property3 < 50
  2. Banana has all Property1 < 35, Property2 < 45, Property3 < 55
  3. Pear has all Property1 < 37, Property2 < 46, Property3 < 53

Results should look like:

   Batch   Fruit  Property1  Property2  Property3 Status 
0      1   Apple         38         55         52  remove
1      1  Banana         59         37         47  remove
2      1   Pear          62         34         25  remove
3      2   Apple         95         64         48  remove
4      2  Banana         10         84         39  remove
5      2   Pear          16         87         38  remove
6      3   Apple         29         34         49    keep
7      3  Banana         27         41         51    keep
8      3   Pear          35         33         17    keep

Solution

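  • Try this. It first flags each row against the thresholds for its own fruit, then keeps a batch only if every row in that batch was flagged:

        import numpy as np

        # Row-level check: flag rows that meet the thresholds for their fruit
        df['Status']='remove'
        df['Status']=np.where((df['Property1']<30)&(df['Property2']<40)&(df['Property3']<50)&(df['Fruit']=='Apple'),'keep',df['Status'])
        df['Status']=np.where((df['Property1']<35)&(df['Property2']<45)&(df['Property3']<55)&(df['Fruit']=='Banana'),'keep',df['Status'])
        df['Status']=np.where((df['Property1']<37)&(df['Property2']<46)&(df['Property3']<53)&(df['Fruit']=='Pear'),'keep',df['Status'])
        # Batch-level check: a batch stays 'keep' only if all of its rows passed
        df['Status']=np.where(df['Status'].eq('keep').groupby(df['Batch']).transform('all'),'keep','remove')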
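
The per-fruit thresholds could also be kept in a small lookup table, with the batch-level rule from the question applied through `groupby(...).transform('all')`. The following is only a sketch of that alternative. The names `limits`, `props`, `row_ok` and `batch_ok` are illustrative, and it assumes every value in 'Fruit' has a row in `limits` (an unknown fruit would raise a `KeyError`).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Batch": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Fruit": ["Apple", "Banana", "Pear"] * 3,
    "Property1": [38, 59, 62, 95, 10, 16, 29, 27, 35],
    "Property2": [55, 37, 34, 64, 84, 87, 34, 41, 33],
    "Property3": [52, 47, 25, 48, 39, 38, 49, 51, 17],
})

props = ["Property1", "Property2", "Property3"]

# Strict upper bounds per fruit, taken from the conditions in the question
limits = pd.DataFrame(
    {"Property1": [30, 35, 37], "Property2": [40, 45, 46], "Property3": [50, 55, 53]},
    index=["Apple", "Banana", "Pear"],
)

# Look up the limits for each row's fruit; to_numpy() drops the fruit index
# so the comparison is positional, row by row
row_limits = limits.loc[df["Fruit"], props].to_numpy()

# A row passes when every property is strictly below its limit
row_ok = df[props].lt(row_limits).all(axis=1)

# A batch passes only when all of its rows pass; transform broadcasts
# the per-batch result back onto every row of that batch
batch_ok = row_ok.groupby(df["Batch"]).transform("all")

df["Status"] = batch_ok.map({True: "keep", False: "remove"})
print(df)
```

With the sample data only Batch 3 ends up as 'keep'. Keeping the thresholds in one table means adding a fruit or changing a limit touches a single place rather than another `np.where` line.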