I train different classifiers with different sets of data, and I need to understand how to properly measure the effectiveness of a classifier.
Here's my code:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': np.arange(4, 6)
}
tree = GridSearchCV(DecisionTreeClassifier(), param_grid)
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target)
tree.fit(X_train, y_train)
tree_preds = tree.predict(X_test)
tree_performance = accuracy_score(y_test, tree_preds)
print('Best params: ', tree.best_params_)
print('Best score: ', tree.best_score_)
print('DecisionTree score: ', tree_performance)
My question is: what actually is the best score from GridSearchCV, and how does it differ from the result of the accuracy_score function?
As I understand it, accuracy_score takes the classes of the test set and compares them with the classes predicted by the algorithm; the result is the percentage of properly classified items. But what is best_score_?
These two values are different and an example output from my script looks like this:
Best score: 0.955357142857
DecisionTree score: 0.947368421053
GridSearchCV does not take your test set into account (looking closely, you'll see that you don't pass your test set to tree.fit()); the score it reports, best_score_, comes from cross-validation (CV) on your training set. From the docs:
best_score_ : float
Mean cross-validated score of the best_estimator
This score (0.955 in your example) is the mean of the scores over each of the 3 CV folds (the default, since you have not specified the cv argument).
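If you want to see where that number comes from, it can be reproduced from the fitted GridSearchCV object itself. A minimal sketch, assuming the variable names from your snippet and a scikit-learn version recent enough to expose cv_results_ and n_splits_:

# best_score_ is the mean cross-validated score of the best parameter
# combination, computed entirely within the training set
best_idx = tree.best_index_
print(tree.cv_results_['mean_test_score'][best_idx])  # same value as tree.best_score_

# the individual per-fold validation scores that were averaged
for k in range(tree.n_splits_):
    print(tree.cv_results_['split%d_test_score' % k][best_idx])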
Your accuracy_score, on the other hand, comes from your test set.
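For completeness, the same test-set number can also be obtained directly from the fitted grid search, since with the default refit=True the best estimator is refit on the whole training set before being used for prediction. A short sketch, again using the variables from your code:

# scoring the refit best estimator on the held-out test set gives the same
# number as accuracy_score(y_test, tree_preds)
print(tree.score(X_test, y_test))
print(accuracy_score(y_test, tree.best_estimator_.predict(X_test)))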
That clarified, it should be obvious that these two numbers are not the same. On the other hand, provided that both the CV procedure and the train-test split have been performed correctly, they should not differ much either, which is arguably the case here.