How can I filter a DataFrame with a lambda if condition?
I have "product_name" and "category1" columns. If the product name does not contain any of the words "boxer", "boxers", "sock", or "socks", I would like to set "category1" to "Other". However, the code below sets every row to "Other", even rows whose product name contains "sock".
df = pd.DataFrame({
'product_name': ["blue shirt", " medium boxers", "red jackets ", "blue sock"],})
df["category1"]=df.apply(lambda x: "Other" if ("boxer","boxers","sock","socks" not in x["product_name"] ) else x["category1"], axis=1)
I expected the result below:
df = pd.DataFrame({
'product_name': ["blue shirt", " medium boxers", "red jackets ", "blue sock"],
'category1': ["Other", np.nan, "Other", np.nan],})
Thank you for your support
You could use str.contains:
import numpy as np

items = ("boxer", "boxers", "sock", "socks")
# Join the words into one regex alternation: "boxer|boxers|sock|socks"
df["category1"] = np.where(df['product_name'].str.contains('|'.join(items)),
                           np.nan,   # value if True
                           'Other')  # value if False
Output:
     product_name category1
0      blue shirt     Other
1   medium boxers       nan
2    red jackets      Other
3       blue sock       nan
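One caveat: because np.where mixes a float (np.nan) with a string ('Other'), NumPy returns a string array, so the "nan" shown above is the text 'nan' rather than a real missing value, and isna() will not detect it. If you need real NaN values, or want to keep an existing category1 for the matching rows, a boolean mask with .loc is one option. The sketch below is my own variation, not the only way to do it; the names has_item and has_item_lambda are just illustrative. It also shows a working lambda version: the original condition is always true because a non-empty tuple is truthy, whereas any() checks each word separately.

```python
import pandas as pd

df = pd.DataFrame({
    "product_name": ["blue shirt", " medium boxers", "red jackets ", "blue sock"],
})
items = ("boxer", "boxers", "sock", "socks")

# True where the product name contains any of the keywords
has_item = df["product_name"].str.contains("|".join(items), na=False)

# Assign "Other" only to the non-matching rows. Matching rows keep their
# existing category1 value, or get a real NaN if the column did not exist yet.
df.loc[~has_item, "category1"] = "Other"

# Row-wise equivalent of the mask, using a lambda with any()
has_item_lambda = df["product_name"].apply(
    lambda name: any(word in name for word in items)
)

print(df)
```

Note that both versions match substrings, so a product name such as "socket" would also count as containing "sock".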