python pandas function dataframe calculated-columns

Creating a Pandas column based on values of another column using a function


I would like to identify doctors in a dataframe based on their title and create a new column indicating whether each person is a doctor, but I am struggling with my code.

doctorcriteria = ['Dr', 'dr']

def doctor(x):
  if doctorcriteria in x:
    return 'Doctor'
  else:
    return 'Not a doctor'

df['doctorcall'] = df.caller_name
df.doctorcall.fillna('Not a doctor', inplace=True)
df.doctorcall = df.doctorcall.apply(doctor)

Solution

  • To create a new column with a function, you can use apply:

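    import pandas as pd

    df = pd.DataFrame({'Title': ['Dr', 'dr', 'Mr'],
                       'Name': ['John', 'Jim', 'Jason']})

    doctorcriteria = ['Dr', 'dr']

    def doctor(row):
        # row is a single DataFrame row (a Series) because apply uses axis=1
        if row.Title in doctorcriteria:
            return 'Doctor'
        return 'Not a doctor'

    # axis=1 passes each row, rather than each column, to doctor
    df['IsDoctor'] = df.apply(doctor, axis=1)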
    

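    But a more direct route is to use map on the Title column, which avoids the row-wise apply. Note that this version produces True/False values rather than the 'Doctor'/'Not a doctor' labels.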

    doctor_titles = {'Dr', 'dr'}
    
    df['IsDoctor'] = df['Title'].map(lambda title: title in doctor_titles)
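
If the text labels are wanted without calling a Python function per row, a vectorized variant may help. The following is a minimal sketch, not the answer's own method: it swaps in isin and numpy.where for the first case. For the second case, the calls frame is hypothetical data shaped like the question's caller_name column, and it assumes the title is the first word of the name, which the question does not confirm.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Title': ['Dr', 'dr', 'Mr'],
                   'Name': ['John', 'Jim', 'Jason']})

doctor_titles = ['Dr', 'dr']

# isin builds a boolean mask; np.where turns the mask into labels.
df['IsDoctor'] = np.where(df['Title'].isin(doctor_titles),
                          'Doctor', 'Not a doctor')

# Hypothetical data resembling the question's caller_name column
# (assumption: the title, if any, is the first word of the name).
calls = pd.DataFrame(
    {'caller_name': ['Dr John Smith', 'dr. Jim Jones', 'Mr Jason Lee', None]}
)

# str.match anchors at the start of the string; \b stops "Drake" from matching.
# na=False treats missing names as non-matches, so no fillna step is needed.
is_doctor = calls['caller_name'].str.match(r'dr\b', case=False, na=False)
calls['doctorcall'] = np.where(is_doctor, 'Doctor', 'Not a doctor')
```

The original attempt fails because `doctorcriteria in x` asks whether a whole list is contained in a string; testing the string against each title (here via a regular expression) is what was intended.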