I see many examples with either MonitoredTrainingSession or tf.Estimator as the training framework. However, it's not clear why I would use one over the other. Both are configurable with SessionRunHooks. Both integrate with tf.data.Dataset iterators and can feed training/validation datasets. I'm not sure what the benefits of one setup over the other would be.
The short answer is that MonitoredTrainingSession gives the user access to the Graph and Session objects and to the training loop itself, while Estimator hides the details of graphs and sessions from the user and generally makes it easier to run training, especially with train_and_evaluate if you need to evaluate periodically.
MonitoredTrainingSession differs from a plain tf.Session() in that it handles variable initialization, sets up file writers for summaries and checkpoints, and also incorporates functionality for distributed training.
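As a minimal sketch (TF 1.x style; the model, loss, and checkpoint directory below are stand-ins for illustration), a MonitoredTrainingSession loop might look like this:

```python
import tensorflow as tf

# Toy graph: the data and model here are placeholders for your own.
global_step = tf.train.get_or_create_global_step()
x = tf.random_normal([32, 10])
loss = tf.reduce_mean(tf.layers.dense(x, 1) ** 2)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
    loss, global_step=global_step)

hooks = [tf.train.StopAtStepHook(last_step=1000)]

# MonitoredTrainingSession initializes variables, writes summaries and
# checkpoints to checkpoint_dir, and restores from it on restart.
with tf.train.MonitoredTrainingSession(checkpoint_dir='/tmp/mts_example',
                                       hooks=hooks) as sess:
    while not sess.should_stop():
        sess.run(train_op)
```

You keep full control of the graph construction and of what runs inside the loop, which is exactly what Estimator takes away from you.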
The Estimator API, on the other hand, is a high-level construct just like Keras. It is perhaps used less often in examples because it was introduced later. It also lets you distribute training/evaluation with DistributionStrategy, and it has several canned estimators that allow rapid prototyping.
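For comparison, here is a rough sketch of a canned estimator driven by train_and_evaluate; the feature columns, input function, and model_dir are made up for illustration:

```python
import tensorflow as tf

# Hypothetical input pipeline; replace with your own tf.data.Dataset.
def input_fn():
    features = {'x': tf.random_normal([32, 10])}
    labels = tf.random_uniform([32], maxval=2, dtype=tf.int32)
    return tf.data.Dataset.from_tensors((features, labels)).repeat()

feature_columns = [tf.feature_column.numeric_column('x', shape=[10])]

# Canned estimator: no graph or session handling in user code.
estimator = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32],
    model_dir='/tmp/estimator_example')

# train_and_evaluate interleaves training with periodic evaluation
# and also works in distributed setups via RunConfig.
train_spec = tf.estimator.TrainSpec(input_fn=input_fn, max_steps=1000)
eval_spec = tf.estimator.EvalSpec(input_fn=input_fn, steps=10)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
```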
In terms of model definition they are pretty much equal: both let you use keras.layers, or define a completely custom model from the ground up (see the model_fn sketch below). So, if for whatever reason you need access to the graph construction or want to customize the training loop, use MonitoredTrainingSession. If you just want to define a model, train it, and run validation and prediction without additional complexity and boilerplate code, use Estimator.
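To illustrate that last point, a custom model_fn built from keras.layers can be handed to tf.estimator.Estimator; the same layer calls could equally be wired into a graph that you drive yourself with MonitoredTrainingSession. The architecture below is arbitrary:

```python
import tensorflow as tf

# Hypothetical custom model_fn using keras.layers.
def model_fn(features, labels, mode):
    net = tf.keras.layers.Dense(64, activation='relu')(features['x'])
    logits = tf.keras.layers.Dense(2)(net)

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions={'logits': logits})

    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode, loss=loss)

    train_op = tf.train.AdamOptimizer().minimize(
        loss, global_step=tf.train.get_or_create_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn,
                                   model_dir='/tmp/custom_estimator')
```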