python pandas categorical

TypeError: Cannot setitem on a Categorical with a new category


I need to replace all values in the otherdr column that are not equal to 'no', 'n/a' or 'N/A' with 1.0. I tried converting the column to a categorical variable with its distinct values as the categories, but I still get the same TypeError:

df = pd.DataFrame({'otherdr': ['no', 'N/A', 'N/A', 'Intergov', 'Conciliation', 'yes']})
cat = list(df['otherdr'].unique())
df['otherdr'] = pd.Categorical(df['otherdr'], categories = cat, ordered = False)
df[df['otherdr'] != ('no' or 'n/a' or 'N/A')] = 1.0
TypeError: Cannot setitem on a Categorical with a new category (1.0), set the categories first

Solution

  • Don't use a categorical. Once the categories are defined, you cannot assign a value that is not among them (unless you explicitly add the new category first).

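    Use isin + where. Note that ('no' or 'n/a' or 'N/A') evaluates to just 'no', so the original comparison never tests the other two values; isin checks all three: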

    df['otherdr'] = df['otherdr'].where(df['otherdr'].isin(['no', 'n/a', 'N/A']), 1)
    

    If you really want/need a categorical, convert after replacing the values:

    df['otherdr'] = pd.Categorical(df['otherdr'].where(df['otherdr'].isin(['no', 'n/a', 'N/A']), 1))
    

    Output:

      otherdr
    0      no
    1     N/A
    2     N/A
    3       1
    4       1
    5       1
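
If the column has to stay categorical throughout, the "add a new category first" route mentioned above could look roughly like the sketch below. It assumes the replacement value is the float 1.0 from the question and uses `Series.cat.add_categories` and `Series.cat.remove_unused_categories`; the variable names `keep` and `result` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'otherdr': ['no', 'N/A', 'N/A', 'Intergov', 'Conciliation', 'yes']})
df['otherdr'] = pd.Categorical(df['otherdr'])

# Register the replacement value as a category before assigning it;
# without this step the assignment below raises the TypeError from the question.
df['otherdr'] = df['otherdr'].cat.add_categories([1.0])

# Boolean mask of the rows whose value should be kept unchanged.
keep = df['otherdr'].isin(['no', 'n/a', 'N/A'])

# Overwrite only the 'otherdr' column in the remaining rows.
df.loc[~keep, 'otherdr'] = 1.0

# Optional: drop categories that no longer occur in the data.
df['otherdr'] = df['otherdr'].cat.remove_unused_categories()

result = df['otherdr'].tolist()
print(df)
```

Selecting the column in `df.loc[~keep, 'otherdr']` also avoids overwriting entire rows, which `df[mask] = 1.0` would do in a frame with more than one column.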