What is ".cat" in ".cat.categories"?


I am not sure what .cat is, or why it is needed when .categories by itself already seems to ask for the column's categories. For example, why should I use this:

housing_df.REMODEL.cat.categories

And not this:

housing_df.REMODEL.categories

The pandas documentation explains "Series.cat.categories", but I do not understand it.


Solution

  • .cat is an accessor object. It gives you access to attributes and methods that are specific to categorical columns, which is why housing_df.REMODEL.categories does not work: a Series has no categories attribute of its own. pandas has similar accessors: .dt for datetime data, .str for strings, .plot for plotting, and so on. This is a design decision by the pandas developers: accessors act as namespaces, so attributes and methods are better organized and do not cause name collisions.
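
A minimal sketch of the difference, assuming a small made-up stand-in for housing_df (the REMODEL values below are hypothetical, not taken from the question's data):

```python
import pandas as pd

# Hypothetical stand-in for the question's housing_df; the values are made up.
housing_df = pd.DataFrame(
    {"REMODEL": pd.Categorical(["Old", "Recent", "Never", "Old"])}
)

# The .cat accessor exposes the categorical-only attributes of the column.
categories = housing_df.REMODEL.cat.categories
print(list(categories))

# A Series has no .categories attribute of its own, so direct access fails.
try:
    housing_df.REMODEL.categories
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True
print(direct_access_failed)

# .cat itself is only available when the column has a category dtype.
try:
    pd.Series([1, 2, 3]).cat
    cat_on_ints_failed = False
except AttributeError:
    cat_on_ints_failed = True
print(cat_on_ints_failed)
```

If a column is not categorical yet, converting it with .astype("category") makes the .cat accessor usable.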