Tags: python, pandas, dataframe, list-comprehension

How to add labels to a pandas DataFrame column with an else condition?


I have a dataframe with a column like this:

POLITICS          
BUSINESS 
TRAVEL         
SPORTS
....
DIVORCE
ARTS
WELLNESS
CRIME

For example:

import pandas as pd

data = [['CRIME', 10], ['BUSINESS', 15], ['SPORTS', 12], ['TRAVEL', 2], ['WELLNESS', 3], ['ARTS', 25]]

df = pd.DataFrame(data, columns=['category', 'no'])
df

I want to add a column 'label' and map four categories to labels like so:

label_dict = {'CRIME': 1, 'BUSINESS': 2, 'SPORTS': 3, 'ARTS': 4}

All of the remaining categories should be labeled 5. I tried the following, but I get a KeyError: 'label'.

df['label'] = df['category'].apply(lambda x: label_dict[x] if x in label_dict else 5)

How can I achieve this?


Solution

  • Try with map, filling the categories that are not in the dictionary with 5:

    df['label'] = df['category'].map(label_dict).fillna(5).astype(int)
    print(df)
    
    # Output
       category  no  label
    0     CRIME  10      1
    1  BUSINESS  15      2
    2    SPORTS  12      3
    3    TRAVEL   2      5
    4  WELLNESS   3      5
    5      ARTS  25      4
    

    Or with replace (the | dict merge requires Python 3.9+):

    df['label'] = df['category'].replace(label_dict | {'.*': 5}, regex=True)
    

    Or, as suggested by @mozway, with dict.get and a default value:

    df['label'] = df['category'].map(lambda x: label_dict.get(x, 5))
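
For reference, here is a minimal self-contained sketch of the map and dict.get approaches, reusing the sample data from the question. The names DEFAULT_LABEL, mapped, via_map and via_get are assumptions introduced only for this illustration. It also shows the intermediate step: map on its own leaves NaN for categories missing from the dictionary, which is why fillna and astype(int) follow it.

```python
import pandas as pd

# Sample data from the question
data = [['CRIME', 10], ['BUSINESS', 15], ['SPORTS', 12],
        ['TRAVEL', 2], ['WELLNESS', 3], ['ARTS', 25]]
df = pd.DataFrame(data, columns=['category', 'no'])

label_dict = {'CRIME': 1, 'BUSINESS': 2, 'SPORTS': 3, 'ARTS': 4}
DEFAULT_LABEL = 5  # hypothetical name: label for categories not in label_dict

# Step 1: map() alone yields NaN for categories missing from the dict,
# so the intermediate result is a float column.
mapped = df['category'].map(label_dict)

# Step 2: replace the NaN values with the default and cast back to int.
via_map = mapped.fillna(DEFAULT_LABEL).astype(int)

# Alternative: dict.get() supplies the default directly, so no NaN step is needed.
via_get = df['category'].map(lambda x: label_dict.get(x, DEFAULT_LABEL))

df['label'] = via_map
print(df)
```

Both variants should give the same labels for this data; the dict.get version skips the NaN round trip, while the map/fillna version stays vectorized on the lookup.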