I have a pandas dataframe with a field called "promo_type" which I converted to categorical by using astype:
df['promo_type'] = df['promo_type'].astype('category')
Later on in the code I want to add another category to the field, as follows:
df['promo_type'].add_categories('0')
And I get this error:
AttributeError: 'Series' object has no attribute 'add_categories'
I have checked that my pandas version does have add_categories, and that add_categories is an available method for df['promo_type'].
I have no idea why this isn't working.
Thanks for the help in advance.
You missed the cat accessor. On a Series you have to use Series.cat.add_categories, and assign the result back because it returns a new Series:
df['promo_type'] = df['promo_type'].cat.add_categories('0')
Setup:
import pandas as pd
df = pd.DataFrame({'promo_type': ['a', 'b', 'c']}).astype('category')
print(df['promo_type'])
# Output
0    a
1    b
2    c
Name: promo_type, dtype: category
Categories (3, object): ['a', 'b', 'c']
Add category:
df['promo_type'] = df['promo_type'].cat.add_categories('0')
print(df['promo_type'])
# Output
0    a
1    b
2    c
Name: promo_type, dtype: category
Categories (4, object): ['a', 'b', 'c', '0'] # <- HERE
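A common reason to add a category is to fill missing values with it afterwards. That use case is an assumption here, since the question does not say why '0' is needed. The sketch below uses a made-up Series s to show that add_categories leaves the original untouched, and that fillna only accepts '0' once it is a category:

```python
import pandas as pd

# Hypothetical data with one missing value (not from the question).
s = pd.Series(['a', 'b', None], dtype='category')

# add_categories returns a new Series; s itself is unchanged.
added = s.cat.add_categories('0')
print(list(s.cat.categories))      # ['a', 'b']
print(list(added.cat.categories))  # ['a', 'b', '0']

# Filling with a value that is not yet a category is rejected.
# The exception type has differed between pandas versions, so catch both.
try:
    s.fillna('0')
    fill_failed = False
except (TypeError, ValueError):
    fill_failed = True
print(fill_failed)

# After adding the category, the same fill works.
filled = added.fillna('0')
print(filled.tolist())  # ['a', 'b', '0']
```

This is why the result is assigned back to df['promo_type'] above: calling add_categories without keeping the return value has no visible effect.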
Update
You can call add_categories directly, without the cat accessor, when the object is a CategoricalIndex rather than a Series:
df = pd.DataFrame({'promo_type': ['a', 'b', 'c']})
catx = pd.CategoricalIndex(df['promo_type'])
print(catx)
# Output
CategoricalIndex(['a', 'b', 'c'], categories=['a', 'b', 'c'], ordered=False, dtype='category', name='promo_type')
Modify category:
catx = catx.add_categories('0')
print(catx)
# Output
CategoricalIndex(['a', 'b', 'c'], categories=['a', 'b', 'c', '0'], ordered=False, dtype='category', name='promo_type')
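To check where the method actually lives before calling it, hasattr is a quick test. This is only a sketch on a small example Series (the name s is made up here); it shows the Series itself lacks the method while its cat accessor and a CategoricalIndex both have it:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c'], dtype='category', name='promo_type')

# The method is not defined on the Series itself...
on_series = hasattr(s, 'add_categories')
# ...but it is available through the .cat accessor...
on_accessor = hasattr(s.cat, 'add_categories')
# ...and directly on a CategoricalIndex.
on_index = hasattr(pd.CategoricalIndex(s), 'add_categories')

print(on_series, on_accessor, on_index)  # False True True
```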