I use scikit-learn's GridSearchCV to grid search hyperparameters for my Keras neural network (for a regression problem). The output of my neural network is a real value:
# imports (keras.wrappers.scikit_learn is the scikit-learn wrapper shipped with Keras)
import keras
import sklearn.model_selection

# generate a model (createModel is a function which returns a keras.Sequential model)
model = keras.wrappers.scikit_learn.KerasRegressor(build_fn=createModel)

# run the grid search
paramGrid = dict(epochs=[100, 250, 500], batch_size=[16, 32, 64])
grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=paramGrid, n_jobs=1, cv=5)

# obtain and print the result (X, y are some data)
grid_result = grid.fit(X, y)
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
I don't understand what exactly the best_score_ member in the grid search result is. Is it a gap between the theoretical values and the predicted values? This best_score_ is always negative (and quite large) on my examples, which doesn't make any sense to me.
When you don't pass a specific scoring metric, GridSearchCV will use the default score method of the estimator.
In your example, you did not pass a scoring metric to your grid search instance, so it will use the default score metric of KerasRegressor, which (according to the source code on GitHub) is the negated mean loss of the predictions. The negation is there because scikit-learn's convention is that a higher score is always better, while a loss should be minimized; this is exactly why your best_score_ is always negative. And since you set cv=5, grid_result.best_score_ is the average of this score over all 5 validation folds.
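For reference, here is roughly what KerasRegressor.score does, as a simplified sketch of the wrapper's source (not a verbatim copy):

# simplified sketch of keras.wrappers.scikit_learn.KerasRegressor.score
def score(self, x, y, **kwargs):
    loss = self.model.evaluate(x, y, **kwargs)  # your compiled loss, e.g. MSE
    if isinstance(loss, list):
        loss = loss[0]  # first entry is the loss when extra metrics are compiled
    return -loss  # negated so that "greater is better", hence the negative best_score_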
I suggest you set your own performance metric by passing a value for scoring. Since this is a regression problem, pick a regression metric such as neg_mean_squared_error (a classification metric like roc_auc would not apply here). For example:
grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=paramGrid,
                                            scoring='neg_mean_squared_error', n_jobs=1, cv=5)
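Note that scikit-learn's neg_* metrics follow the same "greater is better" convention, so best_score_ will still be negative; negate it back to read the actual error:

# -best_score_ recovers the mean squared error itself
print("Best MSE: %f using %s" % (-grid_result.best_score_, grid_result.best_params_))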
You can find a list of all the supported metrics in the scikit-learn documentation (https://scikit-learn.org/stable/modules/model_evaluation.html). You can also define your own.
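For instance, here is a minimal sketch of a custom regression metric wrapped with sklearn.metrics.make_scorer (the metric and helper names below are illustrative, not part of your code):

import numpy as np
from sklearn.metrics import make_scorer

# illustrative custom metric: mean absolute percentage error
def mean_absolute_percentage_error(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# greater_is_better=False makes scikit-learn negate the metric internally,
# so GridSearchCV still maximizes the score while the error is minimized
mape_scorer = make_scorer(mean_absolute_percentage_error, greater_is_better=False)

grid = sklearn.model_selection.GridSearchCV(estimator=model, param_grid=paramGrid,
                                            scoring=mape_scorer, n_jobs=1, cv=5)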