Tags: python, tensorflow, tensorboard, hyperparameters

Bayesian Optimisation via HParams and Tensorboard


I'm currently using HParams to run a grid-search hyperparameter optimisation session. This works fine: the logs are written to the TensorBoard HParams plugin, and I can see the individual runs and the Parallel Coordinates view. The code is structured as follows, although it may not be necessary to review it for this question:

import itertools
import os
import time

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp


def hparam_wrap(args, n_classes, train_dataset, val_dataset, tokenizer):
    log_date_subfolder = time.strftime("%Y%m%d-%H%M%S")
    hparams_dict = {
        'HP_EMBEDDING_NODES': hp.HParam('embedding_nodes', hp.Discrete([200, 300])),
        'HP_LSTM_NODES': hp.HParam('lstm_nodes', hp.Discrete([200, 300])),
        'HP_TIMEDIST_NODES': hp.HParam('timedist_nodes', hp.Discrete([200, 300])),
        'HP_NUM_DENSE_LAYERS': hp.HParam('num_dense_layers', hp.Discrete([3, 4, 5])),
        'HP_DENSE_NODES': hp.HParam('dense_nodes', hp.Discrete([300, 400, 500])),
        'HP_LEARNING_RATE': hp.HParam('learning_rate', hp.Discrete([0.001, 0.0001, 0.00001])),
        'HP_DROPOUT': hp.HParam('dropout', hp.Discrete([0.3, 0.4, 0.5, 0.6])),
        'HP_BATCH_SIZE': hp.HParam('batch_size', hp.Discrete([96]))
    }
    # All trials log under the same timestamped root.
    log_dir = os.path.join('s3://sn-classification', args.type, 'Logs', args.country,
                           args.subfolder, 'HParams', log_date_subfolder)

    # Grid search: one trial per element of the Cartesian product of all domains.
    keys = list(hparams_dict)
    domains = [hparams_dict[k].domain.values for k in keys]
    for session_num, values in enumerate(itertools.product(*domains)):
        hparams = dict(zip(keys, values))
        run_name = "run-%d" % session_num
        print('--- Starting trial: %s' % run_name)
        print(hparams)

        run_hparam(log_dir, hparams, hparams_dict, args, n_classes, train_dataset,
                   val_dataset, tokenizer)


def run_hparam(log_dir, hparams, hparams_dict, args, n_classes, train_dataset, val_dataset, tokenizer):
    with tf.summary.create_file_writer(log_dir).as_default():
        hp.hparams_config(
            hparams=list(hparams_dict.values()),
            metrics=[
                hp.Metric('val_top_k_categorical_accuracy', display_name='TopK_Val_Accuracy'),
                hp.Metric('val_loss', display_name='val_loss'),
            ],
        )
        # Record the values used in this trial, keyed by their HParam objects.
        hp.hparams({hparams_dict[h]: hparams[h] for h in hparams_dict})
        history = train(args, n_classes, hparams, train_dataset, val_dataset, tokenizer)
        # Log the final-epoch validation metrics for this trial.
        tf.summary.scalar('val_top_k_categorical_accuracy', history['val_top_k_categorical_accuracy'][-1], step=1)
        tf.summary.scalar('val_loss', history['val_loss'][-1], step=1)

I've done a lot of googling, but I'm still unsure how to go about implementing a more efficient optimisation session, such as Bayesian Optimisation in order to find the optimum model in a faster way. All I want to know is - is it possible to do Bayesian Optimisation within HParams, or do I need to use a different package like Weights and Biases? If it's possible, any advice on where to find an example of such an implementation would be very helpful.


Solution

  • This is a long-standing open feature request and, at the time of writing, Bayesian optimisation is unfortunately still not available through the HPARAMS section. Keras Tuner, however, can run the optimisation and lets you log the results of each run to TensorBoard. Encoding the hyperparameter values into the names of the run directories could be a quick-and-dirty workaround for the missing HPARAMS view. For the benefit of future readers, I have provided a guide to using Keras Tuner's Bayesian optimisation with TensorBoard logging at the end of this answer.

    I might add that TensorBoard visualisation is most useful when grid or random search results are informing a developer's manual tuning intuitions. Bayesian optimisation, by contrast, is a self-contained black-box optimiser, so you should be able to let it run without the lack of visualisations affecting the optimisation itself, though I agree this would still be a nice feature to have.
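
For a sense of scale, here is a small sketch that counts the combinations in the question's grid and compares them with a trial budget. The dictionary simply restates the domains from the question; the budget of 20 matches the max_trials value used further down and is otherwise an arbitrary assumption.

```python
import math

# Domains copied from the grid search in the question.
grid = {
    'embedding_nodes': [200, 300],
    'lstm_nodes': [200, 300],
    'timedist_nodes': [200, 300],
    'num_dense_layers': [3, 4, 5],
    'dense_nodes': [300, 400, 500],
    'learning_rate': [0.001, 0.0001, 0.00001],
    'dropout': [0.3, 0.4, 0.5, 0.6],
    'batch_size': [96],
}

# A full grid search trains one model per element of the Cartesian product.
grid_trials = math.prod(len(values) for values in grid.values())

# Trial budget handed to the Bayesian tuner (assumed; any cap works).
bayes_trials = 20

print(grid_trials)                 # 864
print(grid_trials / bayes_trials)  # 43.2
```

So the full grid needs 864 training runs, roughly 43 times the budget given to the Bayesian tuner in this answer.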

    To implement Bayesian optimisation in TensorFlow and log the losses for each run, I provide the following for future readers:

    First define a HyperParameters object hp.

    # Current releases import the package as keras_tuner (formerly kerastuner).
    from keras_tuner import HyperParameters
    hp = HyperParameters()
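
A freshly created HyperParameters object is empty, so the search space has to be registered on it before hp.get can return anything. As a minimal sketch, the helper below mirrors a few of the question's discrete domains with hp.Choice. The helper name register_search_space and the SEARCH_SPACE dictionary are illustrative assumptions, not part of Keras Tuner.

```python
# Hypothetical search space, reusing names and values from the question.
SEARCH_SPACE = {
    'num_dense_layers': [3, 4, 5],
    'dense_nodes': [300, 400, 500],
    'learning_rate': [0.001, 0.0001, 0.00001],
    'dropout': [0.3, 0.4, 0.5, 0.6],
}


def register_search_space(hp, space=SEARCH_SPACE):
    """Register every entry of `space` as a Choice hyperparameter on `hp`.

    `hp` is expected to be a Keras Tuner HyperParameters object; only its
    Choice(name, values) method is used here.
    """
    for name, values in space.items():
        hp.Choice(name, values)
    return hp
```

With Keras Tuner installed, calling register_search_space(hp) on the object defined above should make hp.get('dropout') and the other names available to model_builder. Where a continuous range makes more sense than a fixed list, hp.Float or hp.Int could be swapped in for hp.Choice.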
    

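    Write a model_builder function that takes hp as its argument and returns a compiled model, reading each hyperparameter with hp.get('name'). Note that hp.get only works for hyperparameters that have already been registered on hp. Then define a Keras Tuner BayesianOptimization tuner.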

    import keras_tuner as kt  # formerly imported as kerastuner
    tuner = kt.BayesianOptimization(model_builder,
                                    hyperparameters = hp,
                                    max_trials      = 20,
                                    objective       = 'val_loss')
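
The model_builder passed to the tuner is not shown above, so here is a minimal sketch of what it might look like. It assumes hp already holds hyperparameters named num_dense_layers, dense_nodes, dropout and learning_rate (names borrowed from the question), and it builds a plain dense regression network purely for illustration; the real architecture would be whatever your own training code builds. TensorFlow is imported inside the function only so that the snippet can be loaded without it.

```python
def model_builder(hp):
    """Build and compile a Keras model from the values currently held in `hp`."""
    import tensorflow as tf  # lazy import: keeps the sketch loadable without TensorFlow

    model = tf.keras.Sequential()
    # Stack the tuned number of hidden blocks, each a Dense layer followed by Dropout.
    for _ in range(hp.get('num_dense_layers')):
        model.add(tf.keras.layers.Dense(hp.get('dense_nodes'), activation='relu'))
        model.add(tf.keras.layers.Dropout(hp.get('dropout')))
    # Single linear output; an assumption made because the example fits `prices`.
    model.add(tf.keras.layers.Dense(1))

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=hp.get('learning_rate')),
        loss='mse',
    )
    return model
```

The tuner calls this function once per trial with a different set of values in hp, which is why every tunable quantity is read through hp.get rather than hard-coded.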
    

    Include tf.keras.callbacks.TensorBoard(cb_dir) in your callbacks to log the loss curves of each BayesianOptimization trial under the directory cb_dir. This gives you the scalar plots against epoch, but not the HPARAMS section. You may wish to name the run directories so that they list the hyperparameter values.

    tuner.search(inputs, prices,
                 validation_split = 0.2,
                 batch_size       = 32,
                 callbacks        = [tf.keras.callbacks.TensorBoard(cb_dir)],
                 epochs           = 30)
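
One way to get hyperparameter values into the run names, sketched here with only the standard library, is to format a dictionary of values into a directory name. The function name hparams_to_run_name and the name=value format are assumptions for illustration; how you hook this into the tuner's log directories depends on your Keras Tuner version.

```python
def hparams_to_run_name(values, prefix='run'):
    """Turn a dict of hyperparameter values into a filesystem-friendly run name."""
    parts = []
    for name in sorted(values):             # sorted keys give a stable ordering
        value = values[name]
        if isinstance(value, float):
            value = format(value, 'g')      # compact floats, e.g. 1e-05
        parts.append(f'{name}={value}')
    return '-'.join([prefix] + parts)


print(hparams_to_run_name({'dropout': 0.3, 'dense_nodes': 300, 'learning_rate': 0.00001}))
# run-dense_nodes=300-dropout=0.3-learning_rate=1e-05
```

The dictionary could come from the .values attribute of a HyperParameters object, for example after the search has finished.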
    

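    Retrieve the top n scoring hyperparameter combinations with get_best_hyperparameters, which returns HyperParameters objects ordered best-first; the .values attribute of each is a plain dictionary.

    ith_best_hp_dict = tuner.get_best_hyperparameters(num_trials = n)[i].values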