python, pandas, dataframe, conditional-statements

How do I assign values based on multiple conditions for existing columns?


I would like to create a new column with a numerical value based on the following conditions:

a. if gender is male & pet1==pet2, points = 5

b. if gender is female & (pet1 is 'cat' or pet1 is 'dog'), points = 5

c. all other combinations, points = 0

    gender    pet1      pet2
0   male      dog       dog
1   male      cat       cat
2   male      dog       cat
3   female    cat       squirrel
4   female    dog       dog
5   female    squirrel  cat
6   squirrel  dog       cat

I would like the end result to be as follows:

    gender    pet1      pet2      points
0   male      dog       dog       5
1   male      cat       cat       5
2   male      dog       cat       0
3   female    cat       squirrel  5
4   female    dog       dog       5
5   female    squirrel  cat       0
6   squirrel  dog       cat       0

How do I accomplish this?


Solution

  • You can do this using np.where. The conditions use the bitwise operators & and | for "and" and "or", and each comparison is wrapped in parentheses because & and | bind more tightly than ==. Where the combined condition is true, 5 is returned, and 0 otherwise:

    In [29]:
    df['points'] = np.where(
        ((df['gender'] == 'male') & (df['pet1'] == df['pet2']))
        | ((df['gender'] == 'female') & df['pet1'].isin(['cat', 'dog'])),
        5, 0)
    df
    
    Out[29]:
         gender      pet1      pet2  points
    0      male       dog       dog       5
    1      male       cat       cat       5
    2      male       dog       cat       0
    3    female       cat  squirrel       5
    4    female       dog       dog       5
    5    female  squirrel       cat       0
    6  squirrel       dog       cat       0
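
For anyone who wants to run this from scratch, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample data in the question, and the mask names `male_same_pet` and `female_cat_or_dog` are just illustrative names, not anything from the original answer. Splitting the condition into named boolean masks is one way to keep the parentheses manageable:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'gender': ['male', 'male', 'male', 'female', 'female', 'female', 'squirrel'],
    'pet1': ['dog', 'cat', 'dog', 'cat', 'dog', 'squirrel', 'dog'],
    'pet2': ['dog', 'cat', 'cat', 'squirrel', 'dog', 'cat', 'cat'],
})

# Rule a: gender is male and both pets match.
male_same_pet = (df['gender'] == 'male') & (df['pet1'] == df['pet2'])

# Rule b: gender is female and pet1 is a cat or a dog.
female_cat_or_dog = (df['gender'] == 'female') & df['pet1'].isin(['cat', 'dog'])

# Rule c: every other row gets 0.
df['points'] = np.where(male_same_pet | female_cat_or_dog, 5, 0)

print(df)
```

Each mask is a boolean Series aligned with df, so combining them with | gives one True/False value per row, and np.where picks 5 or 0 accordingly.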