I have a DataFrame column with three values: Bart, Peg, and Human. I need to one-hot encode it so that Bart and Peg each get a column and Human is represented as 0 0.
Xi | Architecture
0 | Bart
1 | Bart
2 | Peg
3 | Human
4 | Human
5 | Peg
...
I want to one-hot encode them so that Human is represented as 0 0:
Xi | Bart | Peg
0 | 1 | 0
1 | 1 | 0
2 | 0 | 1
3 | 0 | 0
4 | 0 | 0
5 | 0 | 1
But when I run:
pd.get_dummies(df['Architecture'], drop_first = True)
it removes "Bart" and keeps the other two columns. Is there a way to specify which column to drop?
IIUC, use str.get_dummies to encode every value, then drop the 'Human' column:
df['Architecture'].str.get_dummies().drop('Human', axis=1)
Output:
   Bart  Peg
0     1    0
1     1    0
2     0    1
3     0    0
4     0    0
5     0    1
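For completeness, here is a self-contained sketch of two ways to pick the dropped column. The DataFrame is rebuilt from the six rows shown in the question, and the variable names (`encoded`, `via_categorical`) are made up for illustration. `dtype=int` is an addition on my part: recent pandas versions return booleans from pd.get_dummies by default, and this keeps the 0/1 output. The second variant relies on drop_first dropping the first category level, so listing 'Human' first in an explicit category order makes it the one that gets removed.

```python
import pandas as pd

# Sample data reconstructed from the rows shown in the question.
df = pd.DataFrame(
    {"Architecture": ["Bart", "Bart", "Peg", "Human", "Human", "Peg"]}
)

# Option 1: encode every value, then drop the baseline column by name.
encoded = pd.get_dummies(df["Architecture"], dtype=int).drop(columns="Human")

# Option 2: make 'Human' the first category level so drop_first removes it.
ordered = pd.Series(
    pd.Categorical(df["Architecture"], categories=["Human", "Bart", "Peg"]),
    index=df.index,
)
via_categorical = pd.get_dummies(ordered, drop_first=True, dtype=int)

print(encoded)
print(via_categorical)
```

Option 1 is the simpler choice when you only need to name the column to drop. Option 2 may be handy if you want to keep using drop_first and control the baseline through the category order.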