I wanted to change the column type to category with the following code:
df["Geography"] = df["Geography"].astype("category")
Then I set up a random forest classifier as follows:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
X = df.drop('target', axis=1)
y = df['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=123, stratify=y)
forest = RandomForestClassifier(n_estimators=500, random_state=1)
And when fitting the model:
forest.fit(X_train, y_train)
The following error occurs:
could not convert string to float: 'Spain'
'Spain' is a value in the Geography column, which I converted to a categorical type. Why do I still get an error?
Your feature's dtype has changed to "category", but the categories themselves are still strings (the country names), and the random forest expects numeric input. If you need the categories as numbers, you can use the integer codes of the categorical index:
df["Geography"] = pd.CategoricalIndex(df["Geography"]).codes
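As a rough illustration of why the cast alone does not help, here is a minimal sketch of two common ways to get numeric values out of a categorical column. It assumes a small made-up DataFrame in place of the real data: the `Age` column and all values are hypothetical, and only `Geography` and `target` come from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's DataFrame.
df = pd.DataFrame({
    "Geography": ["France", "Spain", "Germany", "Spain"],
    "Age": [42, 41, 39, 50],
    "target": [1, 0, 1, 0],
})

# Casting to "category" changes the dtype, but the values are still strings.
df["Geography"] = df["Geography"].astype("category")
still_strings = df["Geography"].tolist()

# Option 1: integer codes, one number per category.
# Note that this implies an ordering between countries.
codes = df["Geography"].cat.codes

# Option 2: one-hot encoding, one 0/1 indicator column per country.
X = pd.get_dummies(df.drop("target", axis=1), columns=["Geography"], dtype=int)
y = df["target"]

print(codes.tolist())
print(X.columns.tolist())
```

Either result is fully numeric, so it can be passed to `train_test_split` and `forest.fit`. One-hot encoding is often the safer choice for unordered categories such as countries, since integer codes suggest an ordering that does not really exist.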