I am building a simple Sequential model in Keras (tensorflow backend). During training I want to inspect the individual training batches and model predictions. Therefore, I am trying to create a custom Callback
that saves the model predictions and targets for each training batch. However, the model is not using the current batch for prediction, but the entire training data.
How can I hand over only the current training batch to the Callback
?
And how can I access the batches and targets that the Callback
saves in self.predhis and self.targets?
My current version looks as follows:
callback_list = [prediction_history((self.x_train, self.y_train))]
self.model.fit(self.x_train, self.y_train, batch_size=self.batch_size, epochs=self.n_epochs, validation_data=(self.x_val, self.y_val), callbacks=callback_list)
class prediction_history(keras.callbacks.Callback):
def __init__(self, train_data):
self.train_data = train_data
self.predhis = []
self.targets = []
def on_batch_end(self, epoch, logs={}):
x_train, y_train = self.train_data
self.targets.append(y_train)
prediction = self.model.predict(x_train)
self.predhis.append(prediction)
tf.logging.info("Prediction shape: {}".format(prediction.shape))
tf.logging.info("Targets shape: {}".format(y_train.shape))
NOTE: this answer is outdated and only works with TF1. Check @bers's answer for a solution tested on TF2.
After model compilation, the placeholder tensor for y_true
is in model.targets
and y_pred
is in model.outputs
.
To save the values of these placeholders at each batch, you can:
on_batch_end
, and store the resulting arrays.Now step 1 is a bit involved because you'll have to add an tf.assign
op to the training function model.train_function
. Using current Keras API, this can be done by providing a fetches
argument to K.function()
when the training function is constructed.
In model._make_train_function()
, there's a line:
self.train_function = K.function(inputs,
[self.total_loss] + self.metrics_tensors,
updates=updates,
name='train_function',
**self._function_kwargs)
The fetches
argument containing the tf.assign
ops can be provided via model._function_kwargs
(only works after Keras 2.1.0).
As an example:
from keras.layers import Dense
from keras.models import Sequential
from keras.callbacks import Callback
from keras import backend as K
import tensorflow as tf
import numpy as np
class CollectOutputAndTarget(Callback):
def __init__(self):
super(CollectOutputAndTarget, self).__init__()
self.targets = [] # collect y_true batches
self.outputs = [] # collect y_pred batches
# the shape of these 2 variables will change according to batch shape
# to handle the "last batch", specify `validate_shape=False`
self.var_y_true = tf.Variable(0., validate_shape=False)
self.var_y_pred = tf.Variable(0., validate_shape=False)
def on_batch_end(self, batch, logs=None):
# evaluate the variables and save them into lists
self.targets.append(K.eval(self.var_y_true))
self.outputs.append(K.eval(self.var_y_pred))
# build a simple model
# have to compile first for model.targets and model.outputs to be prepared
model = Sequential([Dense(5, input_shape=(10,))])
model.compile(loss='mse', optimizer='adam')
# initialize the variables and the `tf.assign` ops
cbk = CollectOutputAndTarget()
fetches = [tf.assign(cbk.var_y_true, model.targets[0], validate_shape=False),
tf.assign(cbk.var_y_pred, model.outputs[0], validate_shape=False)]
model._function_kwargs = {'fetches': fetches} # use `model._function_kwargs` if using `Model` instead of `Sequential`
# fit the model and check results
X = np.random.rand(10, 10)
Y = np.random.rand(10, 5)
model.fit(X, Y, batch_size=8, callbacks=[cbk])
Unless the number of samples can be divided by the batch size, the final batch will have a different size than other batches. So K.variable()
and K.update()
can't be used in this case. You'll have to use tf.Variable(..., validate_shape=False)
and tf.assign(..., validate_shape=False)
instead.
To verify the correctness of the saved arrays, you can add one line in training.py
to print out the shuffled index array:
if shuffle == 'batch':
index_array = _batch_shuffle(index_array, batch_size)
elif shuffle:
np.random.shuffle(index_array)
print('Index array:', repr(index_array)) # Add this line
batches = _make_batches(num_train_samples, batch_size)
The shuffled index array should be printed out during fitting:
Epoch 1/1 Index array: array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2]) 10/10 [==============================] - 0s 23ms/step - loss: 0.5670
And you can check if cbk.targets
is the same as Y[index_array]
:
index_array = np.array([8, 9, 3, 5, 4, 7, 1, 0, 6, 2])
print(Y[index_array])
[[ 0.75325592 0.64857277 0.1926653 0.7642865 0.38901153]
[ 0.77567689 0.13573623 0.4902501 0.42897559 0.55825652]
[ 0.33760938 0.68195038 0.12303088 0.83509441 0.20991668]
[ 0.98367778 0.61325065 0.28973401 0.28734073 0.93399794]
[ 0.26097574 0.88219054 0.87951941 0.64887846 0.41996446]
[ 0.97794604 0.91307569 0.93816428 0.2125808 0.94381495]
[ 0.74813435 0.08036688 0.38094272 0.83178364 0.16713736]
[ 0.52609421 0.39218962 0.21022047 0.58569125 0.08012982]
[ 0.61276627 0.20679494 0.24124858 0.01262245 0.0994412 ]
[ 0.6026137 0.25620512 0.7398164 0.52558182 0.09955769]]
print(cbk.targets)
[array([[ 0.7532559 , 0.64857274, 0.19266529, 0.76428652, 0.38901153],
[ 0.77567691, 0.13573623, 0.49025011, 0.42897558, 0.55825651],
[ 0.33760938, 0.68195039, 0.12303089, 0.83509439, 0.20991668],
[ 0.9836778 , 0.61325067, 0.28973401, 0.28734073, 0.93399793],
[ 0.26097575, 0.88219053, 0.8795194 , 0.64887846, 0.41996446],
[ 0.97794604, 0.91307569, 0.93816429, 0.2125808 , 0.94381493],
[ 0.74813437, 0.08036689, 0.38094273, 0.83178365, 0.16713737],
[ 0.5260942 , 0.39218962, 0.21022047, 0.58569127, 0.08012982]], dtype=float32),
array([[ 0.61276627, 0.20679495, 0.24124858, 0.01262245, 0.0994412 ],
[ 0.60261369, 0.25620511, 0.73981643, 0.52558184, 0.09955769]], dtype=float32)]
As you can see, there are two batches in cbk.targets
(one "full batch" of size 8 and the final batch of size 2), and the row order is the same as Y[index_array]
.