I have a dataframe with a column like this:
POLITICS
BUSINESS
TRAVEL
SPORTS
....
DIVORCE
ARTS
WELLNESS
CRIME
For example:
import pandas as pd
data = [['CRIME', 10], ['BUSINESS', 15], ['SPORTS', 12], ['TRAVEL', 2], ['WELLNESS', 3], ['ARTS', 25]]
df = pd.DataFrame(data, columns=['category', 'no'])
df
I want to add a column 'label' and map four categories to labels like so
label_dict = {'CRIME': 1, 'BUSINESS': 2, 'SPORTS': 3, 'ARTS': 4}
All of the remaining categories should be labeled as 5. I have tried the following, but I am getting a KeyError: 'label'.
df['label'] = df['category'].apply( lambda x : label_dict[x] if x in label_dict.keys() else 5)
How can I achieve this?
Try with map:
df['label'] = df['category'].map(label_dict).fillna(5).astype(int)
print(df)
# Output
   category  no  label
0     CRIME  10      1
1  BUSINESS  15      2
2    SPORTS  12      3
3    TRAVEL   2      5
4  WELLNESS   3      5
5      ARTS  25      4
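To see why the fillna and astype calls are needed, here is a minimal sketch that splits the one-liner into its steps, using the sample data from the question. The intermediate names mapped and filled are only for illustration.

```python
import pandas as pd

data = [['CRIME', 10], ['BUSINESS', 15], ['SPORTS', 12],
        ['TRAVEL', 2], ['WELLNESS', 3], ['ARTS', 25]]
df = pd.DataFrame(data, columns=['category', 'no'])
label_dict = {'CRIME': 1, 'BUSINESS': 2, 'SPORTS': 3, 'ARTS': 4}

# Step 1: look up each category in the dict.
# Categories missing from label_dict (TRAVEL, WELLNESS) come back as NaN,
# so the result is a float column at this point.
mapped = df['category'].map(label_dict)

# Step 2: replace the NaN values with the fallback label.
filled = mapped.fillna(5)

# Step 3: convert back to integers now that no NaN is left.
df['label'] = filled.astype(int)

print(df)
```

Without the final astype(int) the labels would be displayed as 1.0, 2.0, and so on.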
Or with replace (the dict union operator | requires Python 3.9 or later):
df['label'] = df['category'].replace(label_dict | {'.*': 5}, regex=True)
Or, as suggested by @mozway:
df['label'] = df['category'].map(lambda x: label_dict.get(x, 5))
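If you need this in more than one place, the dict.get variant could be wrapped in a small function. This is only a sketch: add_label and its parameters are hypothetical names, not part of pandas, and it returns a copy rather than modifying the original frame.

```python
import pandas as pd

def add_label(frame, mapping, default, source='category', target='label'):
    """Hypothetical helper: return a copy of frame with a label column added.

    Values found in mapping get their mapped label; everything else
    gets default.
    """
    out = frame.copy()
    # dict.get returns the default when the key is missing, so no NaN appears
    out[target] = out[source].map(lambda value: mapping.get(value, default))
    return out

data = [['CRIME', 10], ['BUSINESS', 15], ['SPORTS', 12],
        ['TRAVEL', 2], ['WELLNESS', 3], ['ARTS', 25]]
df = pd.DataFrame(data, columns=['category', 'no'])
label_dict = {'CRIME': 1, 'BUSINESS': 2, 'SPORTS': 3, 'ARTS': 4}

labeled = add_label(df, label_dict, default=5)
print(labeled)
```

Because the default is supplied during the lookup itself, there is no fillna or astype step.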