I want to use Tensorboard to plot the mean squared error (y-axis) for every iteration over a given time frame (x-axis), say 5 minutes.
However, i can only plot the MSE given every epoch and set a callback at 5 minutes. This does not however solve my problem.
I have tried looking at the internet for some solutions to how you can maybe set a maximum number of iterations rather than epochs when doing model.fit, but without luck. I know iterations is the number of batches needed to complete one epoch, but as I want to tune the batch_size, I prefer to use the iterations.
My code currently looks like the following:
input_size = len(train_dataset.keys())
output_size = 10
hidden_layer_size = 250
n_epochs = 3
weights_initializer = keras.initializers.GlorotUniform()
#A function that trains and validates the model and returns the MSE
def train_val_model(run_dir, hparams):
model = keras.models.Sequential([
#Layer to be used as an entry point into a Network
keras.layers.InputLayer(input_shape=[len(train_dataset.keys())]),
#Dense layer 1
keras.layers.Dense(hidden_layer_size, activation='relu',
kernel_initializer = weights_initializer,
name='Layer_1'),
#Dense layer 2
keras.layers.Dense(hidden_layer_size, activation='relu',
kernel_initializer = weights_initializer,
name='Layer_2'),
#activation function is linear since we are doing regression
keras.layers.Dense(output_size, activation='linear', name='Output_layer')
])
#Use the stochastic gradient descent optimizer but change batch_size to get BSG, SGD or MiniSGD
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.0,
nesterov=False)
#Compiling the model
model.compile(optimizer=optimizer,
loss='mean_squared_error', #Computes the mean of squares of errors between labels and predictions
metrics=['mean_squared_error']) #Computes the mean squared error between y_true and y_pred
# initialize TimeStopping callback
time_stopping_callback = tfa.callbacks.TimeStopping(seconds=5*60, verbose=1)
#Training the network
history = model.fit(normed_train_data, train_labels,
epochs=n_epochs,
batch_size=hparams['batch_size'],
verbose=1,
#validation_split=0.2,
callbacks=[tf.keras.callbacks.TensorBoard(run_dir + "/Keras"), time_stopping_callback])
return history
#train_val_model("logs/sample", {'batch_size': len(normed_train_data)})
train_val_model("logs/sample1", {'batch_size': 1})
%tensorboard --logdir_spec=BSG:logs/sample,SGD:logs/sample1
resulting in:
The desired output should look something like this:
The answer was actually quite simple.
tf.keras.callbacks.TensorBoard has an update_freq argument allowing you to control when to write losses and metrics to tensorboard. The standard is epoch, but you can change it to batch or an integer if you want to write to tensorboard every n batches. See the documentation for more information: https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard