python tensorflow keras tensorboard tensorflow-gradient

How to monitor vanishing and exploding gradients in Keras with TensorBoard?


I would like to monitor the gradients in TensorBoard while training a Keras model, so that I can tell whether they are vanishing or exploding. What should I do?


Solution

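  • To visualize training in TensorBoard, pass a keras.callbacks.TensorBoard callback to model.fit. Set write_grads=True to log the gradients; this option only takes effect when histogram_freq is greater than 0. Note that write_grads belongs to the Keras 2 / TensorFlow 1.x callback, and tf.keras in TensorFlow 2.x ignores it. Right after training starts, you can run...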

    tensorboard --logdir=/full_path_to_your_logs
    

    ... from the command line and point your browser to http://localhost:6006. See the example code in this question.

    To check for vanishing or exploding gradients, look at the gradient distribution and the absolute values in the layer of interest ("Distributions" tab):

    • If the distribution is highly peaked and concentrated around 0, the gradients are probably vanishing. Here's a concrete example of what it looks like in practice.
    • If the distribution grows rapidly in absolute value over time, the gradients are exploding. Often the output values of the same layer become NaN very quickly as well.
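
The two visual checks above can also be written as a rough numeric rule of thumb. The following is only a minimal sketch in plain Python: the function names (mean_abs, diagnose) and both thresholds (vanish_below, explode_ratio) are hypothetical and chosen for illustration, not taken from Keras or TensorBoard. It also assumes you have already collected the gradient values of one layer at several training steps as lists of floats.

```python
import math


def mean_abs(grads):
    """Mean absolute value of one snapshot of a layer's gradient values."""
    return sum(abs(g) for g in grads) / len(grads)


def diagnose(history, vanish_below=1e-6, explode_ratio=100.0):
    """Classify a layer's gradient history as 'vanishing', 'exploding' or 'ok'.

    history: list of snapshots, one per logged step, each a non-empty list
    of floats. Both thresholds are illustrative assumptions, not tuned values.
    """
    if not history:
        raise ValueError("history must contain at least one snapshot")

    magnitudes = [mean_abs(step) for step in history]

    # NaN or inf in the gradients is treated as the end stage of explosion.
    if any(math.isnan(m) or math.isinf(m) for m in magnitudes):
        return "exploding"

    # Rapid growth in absolute value between the first and last snapshot.
    first, last = magnitudes[0], magnitudes[-1]
    if first > 0 and last / first >= explode_ratio:
        return "exploding"

    # Values concentrated tightly around 0 in the latest snapshot.
    if last < vanish_below:
        return "vanishing"

    return "ok"


if __name__ == "__main__":
    print(diagnose([[1e-9, -2e-9, 0.0], [5e-10, -1e-9, 0.0]]))   # vanishing
    print(diagnose([[0.1, -0.2], [5.0, -8.0], [300.0, -450.0]]))  # exploding
    print(diagnose([[0.01, -0.02], [0.015, -0.01]]))              # ok
```

A check like this does not replace looking at the "Distributions" tab. Reasonable thresholds depend on the model, the loss scale and the layer, so treat the numbers above as placeholders.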