I am not sure what .cat is, or why we need it when .categories by itself already asks the interpreter to list the categories of the chosen column.
For example, why should I use this:
housing_df.REMODEL.cat.categories
And not this:
housing_df.REMODEL.categories
The pandas documentation explains "Series.cat.categories", but I do not understand it.
.cat is called an accessor object. It allows you to access attributes and methods that are specific to categorical columns. There are similar accessors in pandas: .dt for date-time, .str for string, .plot for plotting, and so on. This is a design decision made by the pandas developers. In general, accessors are used for namespacing, so that attributes and methods are better organized and do not cause name collisions. It also means that .categories is not an attribute of the column (a Series) itself, so housing_df.REMODEL.categories raises an AttributeError.
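To make the difference concrete, here is a minimal sketch. The small housing_df built below is a hypothetical stand-in for the asker's DataFrame (the REMODEL values are made up for illustration), and it assumes the column has the category dtype.

```python
import pandas as pd

# Hypothetical stand-in for housing_df; the values are invented for illustration.
housing_df = pd.DataFrame(
    {"REMODEL": ["None", "Recent", "Old", "None"]}
).astype({"REMODEL": "category"})

# Category-specific attributes are reached through the .cat accessor.
print(housing_df.REMODEL.cat.categories)

# The Series itself has no .categories attribute, so this raises AttributeError.
try:
    housing_df.REMODEL.categories
except AttributeError as err:
    print("AttributeError:", err)

# The underlying Categorical array does expose .categories directly.
print(housing_df.REMODEL.array.categories)
```

So .categories belongs to the Categorical data, and .cat is the namespace through which a Series exposes it. The same pattern applies to .str and .dt for string and date-time columns.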