Speed up Pandas: find all columns which fullfill set of conditions

I have data represented using pandas DataFrame, which for example looks as follows:

| id | entity | name | value | location

where id is an integer value, entity is an integer , name is a string, value is an integer, and location is a string (for example US, CA, UK etc).

Now, I want to add a new column to this data frame, column "flag", where values are assigned as follows:

for d in df.iterrows():

    if d.entity == 10 and d.value != 1000 and d.location == CA:
        d.flag = "A" 
    elif d.entity != 10 and d.entity != 0 and d.value == 1000 and d.location == US:
        d.flag = "C"
    elif d.entity == 0 and d.value == 1000 and d.location == US"
        d.flag = "B"
    else:
        print("Different case")

Is there a way to speed this up and use some built in functions instead of the for loop?

Solution

Use np.select which you pass a list of conditions, based on those conditions you give it choices and you can specify a default value when none of the conditions is met.

conditions = [
    (d.entity == 10) & (d.value != 1000) & (d.location == 'CA'),
    (d.entity != 10) & (d.entity != 0) & (d.value == 1000) & (d.location == 'US'),
    (d.entity == 0) & (d.value == 1000) & (d.location == 'US')
]

choices = ["A", "C", "B"]

df['flag'] = np.select(conditions, choices, default="Different case")