I am trying to make predictions on a dataset that has a column containing different strings. For example, there are 3 brands, 'A', 'B', and 'C', and I want to replace them with numbers (0, 1 and 2, for example).
I know how to do that if there were only 2 brands, using pd.eq, and I have tried to use set, but I'd like to know if there is an easier method, since I will have to do this for columns that have more than 5 different strings, and it would be pretty annoying.
You can replace them by selecting the records that match each condition. Assuming you have your data in df and the column of interest is 'Brand':
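# Map each brand string to the number that should replace it
replacement = {'A': 0, 'B': 1, 'C': 2}
for key, value in replacement.items():
    # Overwrite 'Brand' only in the rows that currently hold this key
    df.loc[df['Brand'] == key, 'Brand'] = value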
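If writing the loop gets tedious for columns with many distinct strings, here is a minimal sketch of two alternatives that may be simpler: Series.map applies the same dictionary in a single call, and pd.factorize builds the numbering for you. The sample data and the new column names 'Brand_mapped' and 'Brand_code' are made up for illustration; only df and 'Brand' come from the answer above.

```python
import pandas as pd

# Hypothetical sample data; only the 'Brand' column name comes from the answer.
df = pd.DataFrame({"Brand": ["A", "B", "C", "A", "C"]})

# Option 1: apply an explicit mapping in one vectorized call.
# Values that are missing from the dictionary become NaN.
replacement = {"A": 0, "B": 1, "C": 2}
df["Brand_mapped"] = df["Brand"].map(replacement)

# Option 2: let pandas assign the codes, so no dictionary is needed.
# Labels are numbered in order of first appearance in the column.
codes, uniques = pd.factorize(df["Brand"])
df["Brand_code"] = codes
```

With pd.factorize, uniques keeps the original labels, so uniques[code] recovers the string for a given number. Which option fits better likely depends on whether you need to control which number each brand gets.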