I see many examples with either MonitoredTrainingSession or tf.Estimator as the training framework. However, it's not clear why I would use one over the other. Both are configurable with SessionRunHooks. Both integrate with tf.data.Dataset iterators and can feed training/validation datasets. I'm not sure what the benefits of one setup over the other would be.
The short answer is that MonitoredTrainingSession gives the user access to the Graph and Session objects and to the training loop itself, while Estimator hides the details of graphs and sessions from the user and generally makes it easier to run training, especially with train_and_evaluate if you need to evaluate periodically.
MonitoredTrainingSession differs from a plain tf.Session() in that it handles variable initialization, sets up file writers for summaries and checkpoints, and also incorporates functionality for distributed training.
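As a minimal sketch (TF 1.x style; the model, loss, and checkpoint directory below are stand-ins for illustration), a MonitoredTrainingSession loop might look like this:

```python
import tensorflow as tf

# Toy graph: the data and model here are placeholders for your own.
global_step = tf.train.get_or_create_global_step()
x = tf.random_normal([32, 10])
loss = tf.reduce_mean(tf.layers.dense(x, 1) ** 2)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
    loss, global_step=global_step)

hooks = [tf.train.StopAtStepHook(last_step=1000)]

# MonitoredTrainingSession initializes variables, writes summaries and
# checkpoints to checkpoint_dir, and restores from it on restart.
with tf.train.MonitoredTrainingSession(checkpoint_dir='/tmp/mts_example',
                                       hooks=hooks) as sess:
    while not sess.should_stop():
        sess.run(train_op)
```

You keep full control of the graph construction and of what runs inside the loop, which is exactly what Estimator takes away from you.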
The Estimator API, on the other hand, is a high-level construct just like Keras. It is perhaps used less often in examples because it was introduced later. It also lets you distribute training/evaluation with DistributionStrategy, and it has several canned estimators that allow rapid prototyping.
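For comparison, here is a rough sketch of a canned estimator driven by train_and_evaluate; the feature columns, input function, and model_dir are made up for illustration:

```python
import tensorflow as tf

# Hypothetical input pipeline; replace with your own tf.data.Dataset.
def input_fn():
    features = {'x': tf.random_normal([32, 10])}
    labels = tf.random_uniform([32], maxval=2, dtype=tf.int32)
    return tf.data.Dataset.from_tensors((features, labels)).repeat()

feature_columns = [tf.feature_column.numeric_column('x', shape=[10])]

# Canned estimator: no graph or session handling in user code.
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32],
    model_dir='/tmp/estimator_example')

# train_and_evaluate interleaves training with periodic evaluation
# and also works in distributed setups via RunConfig.
train_spec = tf.estimator.TrainSpec(input_fn=input_fn, max_steps=1000)
eval_spec = tf.estimator.EvalSpec(input_fn=input_fn, steps=10)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```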
In terms of model definition they are pretty much equal: both let you use keras.layers, or define a completely custom model from the ground up (see the model_fn sketch below). So, if for whatever reason you need access to the graph construction or want to customize the training loop, use MonitoredTrainingSession. If you just want to define a model, train it, and run validation and prediction without additional complexity and boilerplate code, use Estimator.
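To illustrate that last point, a custom model_fn built from keras.layers can be handed to tf.estimator.Estimator; the same layer calls could equally be wired into a graph that you drive yourself with MonitoredTrainingSession. The architecture below is arbitrary:

```python
import tensorflow as tf

# Hypothetical custom model_fn using keras.layers.
def model_fn(features, labels, mode):
    net = tf.keras.layers.Dense(64, activation='relu')(features['x'])
    logits = tf.keras.layers.Dense(2)(net)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={'logits': logits})

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode, loss=loss)

    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_or_create_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn,
                                   model_dir='/tmp/custom_estimator')
```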