I have the following dataset. None is defined as a Python missing value. The column type is object (from df.dtypes).
import pandas as pd
import numpy as np
df = pd.DataFrame(columns=['triparty'])
df["triparty"] = ["AB65", "None", "GDW322", "DASED", "None"]
I want to create a dummy that takes the value 1 when triparty is None and 0 otherwise. I tried out several variations of
df["triparty"]=[0 if df["triparty"] == np.NaN else 1 for x in df["triparty"]]
df["triparty"]=[0 if df["triparty"] == "None" else 1 for x in df["triparty"]]
but it does not seem to work. I get the error message ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I solve the problem?
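You can do it with np.where. This follows the mapping in your code (0 for "None", 1 otherwise); swap the last two arguments to invert it: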
df["dummy"] = np.where(df["triparty"] == "None", 0, 1)
print(df)
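As for why the original attempt raises the ValueError: inside the list comprehension the condition compares the whole column, df["triparty"] == "None", which yields a boolean Series, and Python cannot reduce a Series to a single True/False for the if. Comparing the loop variable x instead avoids the error. A minimal sketch, assuming the same data as in the question:

```python
import pandas as pd

df = pd.DataFrame({"triparty": ["AB65", "None", "GDW322", "DASED", "None"]})

# Compare each element x, not the whole column df["triparty"]
df["dummy"] = [0 if x == "None" else 1 for x in df["triparty"]]

print(df["dummy"].tolist())  # [1, 0, 1, 1, 0]
```

This works, but the vectorized versions are usually preferable for larger frames.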
Or create a boolean column and cast it to int:
df["dummy"] = (df["triparty"] != "None").astype(int)
# or
df["dummy"] = (~(df["triparty"] == "None")).astype(int)
Output:
  triparty  dummy
0     AB65      1
1     None      0
2   GDW322      1
3    DASED      1
4     None      0
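Note that everything above matches the string "None", which is what the sample data contains. If the column held real missing values (Python None or NaN), an equality test would not help, because NaN never compares equal to anything, including itself. In that case notna() is probably what you want. A minimal sketch, where df_missing is a hypothetical variant of the data with real missing values (not the frame from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the data: real missing values instead of the string "None"
df_missing = pd.DataFrame({"triparty": ["AB65", None, "GDW322", "DASED", np.nan]})

# NaN is not equal to itself, so "== np.nan" can never match
print(np.nan == np.nan)  # False

# notna() is True for present values and False for None/NaN
df_missing["dummy"] = df_missing["triparty"].notna().astype(int)

print(df_missing["dummy"].tolist())  # [1, 0, 1, 1, 0]
```

Use isna() instead of notna() if the dummy should be 1 for the missing rows.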