keras, google-cloud-ml, hyperparameters

Should the hyperparameter metric in Google Cloud ML contain the `val_` prefix?


When defining the hyperparameter metric for Google Cloud ML I can use mean_squared_error, but should I use val_mean_squared_error instead if I want the tuning to compare trials on the validation set? Or does it evaluate on the validation set on its own?

This is the sample hptuning config:

trainingInput:
  ...
  hyperparameters:
    goal: MINIMIZE
    hyperparameterMetricTag: ???mean_squared_error

And this is the fit invocation:

history = m.fit(train_x, train_y, epochs=epochs, batch_size=2048,
                shuffle=False,
                validation_data=(val_x, val_y),
                verbose=verbose,
                callbacks=callbacks)

Since I am passing validation data to Keras's fit, I am in doubt whether I should use val_mean_squared_error.
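
To illustrate, here is a minimal sketch with toy data (not my real model) showing the metric names Keras records once validation data is supplied:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy data, used only to inspect the metric names Keras produces.
x = np.random.rand(100, 4)
y = np.random.rand(100, 1)

m = Sequential([Dense(1, input_shape=(4,))])
m.compile(optimizer='adam', loss='mse', metrics=['mean_squared_error'])

history = m.fit(x, y, epochs=1, validation_split=0.2, verbose=0)

# With validation data, Keras logs each metric twice: once for the
# training data and once, with the val_ prefix, for the validation data.
print(sorted(history.history.keys()))
# e.g. ['loss', 'mean_squared_error', 'val_loss', 'val_mean_squared_error']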


Solution

  • The answer is: if you want Google Cloud ML hyperparameter tuning to use the VALIDATION metric instead of the training metric while using Keras, you need to specify val_mean_squared_error (or val_accuracy, etc.). The corrected config is shown after this answer.

    If you stick to accuracy or mean_squared_error, you will bias the Google Cloud ML tuning process toward selecting overfitted models. To avoid overfitting while searching for the parameters, you should either create your own metric (as mentioned in a comment) or call the fit method with a validation set and use the val_ metrics.


    I've updated the question to explicitly say I am using Keras, which automatically creates the val_mean_squared_error metric.

    To get the answer I realized I could run a simple test: one tuning job with val_mean_squared_error and one with mean_squared_error, both using Keras and invoking fit with the validation data set, and then compare each job's results with its reported metrics (a sketch of how the metric is reported is below).
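
    With that resolved, the sample hptuning config from the question becomes:

    trainingInput:
      ...
      hyperparameters:
        goal: MINIMIZE
        hyperparameterMetricTag: val_mean_squared_error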
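
    For the reporting side, here is a minimal sketch of one way the metric can reach the tuning service. It uses the cloudml-hypertune package, which is my assumption (my actual callbacks are not shown above), so treat it as one possible wiring rather than the exact setup I used:

    import hypertune
    from keras.callbacks import Callback

    class HypertuneReporter(Callback):
        # Assumed helper (not from the original job): reports the
        # validation metric to the tuning service after every epoch.

        def __init__(self):
            super(HypertuneReporter, self).__init__()
            self.hpt = hypertune.HyperTune()

        def on_epoch_end(self, epoch, logs=None):
            logs = logs or {}
            # The tag must match hyperparameterMetricTag in the config.
            self.hpt.report_hyperparameter_tuning_metric(
                hyperparameter_metric_tag='val_mean_squared_error',
                metric_value=logs.get('val_mean_squared_error'),
                global_step=epoch)

    callbacks = [HypertuneReporter()]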