scikit-learn logistic-regression sklearn-pandas multiclass-classification

Why does `LogisticRegression` accept a target variable of type `object` without raising an error?

I am using a basic LogisticRegression on data whose target variable is multiclass and stored as strings.

I was expecting LogisticRegression to raise an error when fit() was called, but it didn't.

Does LogisticRegression handle this case by default? If so, what transformations are applied to the target variable?

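import pandas as pd
from sklearn.linear_model import LogisticRegression

# Four numeric feature columns (A-D) and a string-valued target column (E)
ddf = pd.DataFrame(
    [[1, 2, 3, 4, "Blue"],
     [4, 2, 3, 4, "Red"],
     [5, 2, 8, 4, "Red"],
     [2, 7, 3, 9, "Green"],
     [7, 6, 7, 4, "Blue"]],
    columns=['A', 'B', 'C', 'D', 'E']
)

X = ddf[['A', 'B', 'C', 'D']]
y = ddf['E']  # target holds string labels

lr = LogisticRegression()
lr.fit(X, y)  # no error is raised for the string labels
preds = lr.predict(X)
print(preds)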

This gives the output: ['Blue' 'Red' 'Red' 'Green' 'Blue']

Solution

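Scikit-learn classifiers accept string labels by default. Internally, LogisticRegression encodes the string class labels as integer values (it creates a LabelEncoder object for this; have a look at the scikit-learn source code for the details). The original labels are kept, in sorted order, in the fitted estimator's `classes_` attribute, and `predict` maps the integer class indices back to those labels, which is why the predictions come out as strings.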
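To make the encoding visible, here is a minimal sketch that applies a LabelEncoder by hand to the same labels and compares the result with what the fitted model stores. It uses plain NumPy arrays instead of the DataFrame from the question. The `max_iter=1000` setting and the variable names (`manual`, `y_encoded`) are my own choices for illustration, and the exact internal code path may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

# Same data as in the question, as plain arrays
X = np.array([[1, 2, 3, 4],
              [4, 2, 3, 4],
              [5, 2, 8, 4],
              [2, 7, 3, 9],
              [7, 6, 7, 4]])
y = np.array(["Blue", "Red", "Red", "Green", "Blue"], dtype=object)

# Encoding the labels by hand: classes are sorted, then mapped to 0..n-1
le = LabelEncoder()
y_encoded = le.fit_transform(y)
print(le.classes_)   # ['Blue' 'Green' 'Red']
print(y_encoded)     # [0 2 2 1 0]

# Fitting directly on the string labels (max_iter raised only to
# reduce the chance of a convergence warning on this tiny dataset)
lr = LogisticRegression(max_iter=1000).fit(X, y)
print(lr.classes_)   # the same sorted labels as le.classes_

# predict() picks the highest-scoring class index and looks it up in classes_
manual = lr.classes_[np.argmax(lr.decision_function(X), axis=1)]
print((manual == lr.predict(X)).all())
```

So the integer codes only exist inside the estimator. You never need to encode or decode the target yourself, and `lr.classes_` tells you which column of `predict_proba` or `decision_function` belongs to which label.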