I use scikit-learn's GridSearchCV to grid search hyperparameters for my Keras neural network (for a regression problem). The output of my neural network is a real value:
# imports (keras.wrappers.scikit_learn is the scikit-learn wrapper shipped with Keras)
import keras
import sklearn.model_selection

# generate a model (createModel is a function which returns a keras.Sequential model)
model = keras.wrappers.scikit_learn.KerasRegressor(build_fn=createModel)

# run the grid search
paramGrid = dict(epochs=[100, 250, 500], batch_size=[16, 32, 64])
grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=paramGrid, n_jobs=1, cv=5)

# obtain and print the result (X, y are some data)
grid_result = grid.fit(X, y)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
I don't understand what exactly the best_score_ member in the grid search result is. Is it a gap between the theoretical values and the predicted values? This best_score_ is always negative (and quite large) on my examples, which doesn't make any sense to me.
When you don't pass a specific scoring metric, GridSearchCV will use the default score method of the estimator.
In your example, you did not pass a scoring metric to your grid search instance, so it will use the default score metric of KerasRegressor, which (according to the source code on GitHub) is the negated mean loss of the predictions. The negation is there because scikit-learn's convention is that a higher score is always better, while a loss should be minimized; this is exactly why your best_score_ is always negative. And since you set cv=5, grid_result.best_score_ is the average of this score over all 5 validation folds.
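For reference, here is roughly what KerasRegressor.score does, as a simplified sketch of the wrapper's source (not a verbatim copy):

# simplified sketch of keras.wrappers.scikit_learn.KerasRegressor.score
def score(self, x, y, **kwargs):
    loss = self.model.evaluate(x, y, **kwargs)  # your compiled loss, e.g. MSE
    if isinstance(loss, list):
        loss = loss[0]  # first entry is the loss when extra metrics are compiled
    return -loss  # negated so that "greater is better", hence the negative best_score_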
I suggest you set your own performance metric by passing a value for scoring. Since this is a regression problem, pick a regression metric such as neg_mean_squared_error (a classification metric like roc_auc would not apply here). For example:
grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=paramGrid,
                                            scoring='neg_mean_squared_error', n_jobs=1, cv=5)
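Note that scikit-learn's neg_* metrics follow the same "greater is better" convention, so best_score_ will still be negative; negate it back to read the actual error:

# -best_score_ recovers the mean squared error itself
print("Best MSE: %f using %s" % (-grid_result.best_score_, grid_result.best_params_))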
You can find a list of all the supported metrics in the scikit-learn documentation (https://scikit-learn.org/stable/modules/model_evaluation.html). You can also define your own.
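For instance, here is a minimal sketch of a custom regression metric wrapped with sklearn.metrics.make_scorer (the metric and helper names below are illustrative, not part of your code):

import numpy as np
from sklearn.metrics import make_scorer

# illustrative custom metric: mean absolute percentage error
def mean_absolute_percentage_error(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# greater_is_better=False makes scikit-learn negate the metric internally,
# so GridSearchCV still maximizes the score while the error is minimized
mape_scorer = make_scorer(mean_absolute_percentage_error, greater_is_better=False)

grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=paramGrid,
                                            scoring=mape_scorer, n_jobs=1, cv=5)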