Say I have a pandas column as below:
Type
type1
type2
type3
Now I take dummies for the above as follows:
type_dummies = pd.get_dummies(df["Type"], prefix="type")
Then, after joining it with the main DataFrame, the resulting df would be something like below:
df.drop(['Type'], axis=1, inplace=True)
df = df.join(type_dummies)
df.head()
type_type1 type_type2 type_type3
1 0 0
0 1 0
0 0 1
But what if my training set contains another category, type4, in the Type column? How would I use get_dummies() to generate as many dummies as I want? That is, in this case I want to generate 4 dummy variables even though there are only 3 categories in the column.
You can use the category data type:
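# newer pandas no longer accepts astype('category', categories=...); pass a CategoricalDtype instead
df["Type"] = df["Type"].astype(pd.CategoricalDtype(categories=['type1','type2','type3','type4']))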
df
Out[200]:
Type
0 type1
1 type2
2 type3
pd.get_dummies(df["Type"], prefix="type")
Out[201]:
type_type1 type_type2 type_type3 type_type4
0 1 0 0 0
1 0 1 0 0
2 0 0 1 0
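Putting the steps above together, here is a minimal end-to-end sketch. The sample DataFrame and the `all_types` list are assumptions made up for illustration; `dtype=int` is added only so the result shows 0/1 as in the output above, since recent pandas versions return booleans by default.

```python
import pandas as pd

# Hypothetical frame that contains only three of the four known types.
df = pd.DataFrame({"Type": ["type1", "type2", "type3"]})

# Assumed full list of categories, including the one absent from this data.
all_types = ["type1", "type2", "type3", "type4"]

# Declare the full category set so get_dummies also emits a column for type4.
df["Type"] = df["Type"].astype(pd.CategoricalDtype(categories=all_types))

# dtype=int keeps 0/1 values instead of booleans.
type_dummies = pd.get_dummies(df["Type"], prefix="type", dtype=int)

# Replace the original column with its dummy columns.
df = df.drop(columns=["Type"]).join(type_dummies)
print(df)
```

Because the category list is fixed up front, applying the same dtype to a test set should give the same dummy columns in the same order, even when some categories do not occur in it.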