I'm trying to modify a program that uses the Estimator class in TensorFlow (v1.10) and I would like to access the evaluation metric results every time evaluation occurs so that I can copy the checkpoint files only when a new maximum has been achieved.
One idea I had was to create a class inheriting from SessionRunHook, doing the work I want in the after_run method. According to the documentation I can specify what is passed to after_run using before_run. However, I cannot find a way to access the evaluation metric results I want from the information passed in to before_run.
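A rough sketch of the hook I had in mind is below (assuming I've understood the before_run/after_run contract correctly; the metric tensor name is just a placeholder, since I don't know how to reach the actual metric tensor from inside the hook):

    import tensorflow as tf

    class EvalMetricHook(tf.train.SessionRunHook):
        """Sketch of a hook that tries to fetch a metric tensor during evaluation."""

        def before_run(self, run_context):
            # Ask the session to also fetch this tensor on each run call.
            # "accuracy/value:0" is only a placeholder name; finding the real
            # metric tensor from here is exactly the part I'm stuck on.
            return tf.train.SessionRunArgs(fetches={"accuracy": "accuracy/value:0"})

        def after_run(self, run_context, run_values):
            # run_values.results holds whatever before_run asked to fetch.
            accuracy = run_values.results["accuracy"]
            # ... compare against the best value seen so far and copy checkpoints ...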
I looked into the Estimator code and it appears that it is writing the results to a summary file, so another idea I had was to read this back in the after_run method, but the summary API doesn't seem to provide any read operations.
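The closest thing I found to reading the events back is tf.train.summary_iterator, so something like the sketch below might work, though it feels like a workaround (the eval directory path and the metric tag are assumptions about my setup):

    import glob
    import tensorflow as tf

    # The eval directory and the "accuracy" tag are assumptions about my setup.
    eval_dir = "model_dir/eval"
    best_accuracy = None

    for event_file in glob.glob(eval_dir + "/events.out.tfevents.*"):
        for event in tf.train.summary_iterator(event_file):
            for value in event.summary.value:
                if value.tag == "accuracy":
                    if best_accuracy is None or value.simple_value > best_accuracy:
                        best_accuracy = value.simple_value
                        # ... copy the checkpoint corresponding to event.step here ...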
Are there any other ways I can achieve what I want to do? Not using the Estimator class is not an option, as that would involve drastic changes to the code I'm working with.
Checkpoints are not the same as exporting. Checkpoints are about fault-recovery and involve saving the complete training state (weights, global step number, etc.).
In your case I would recommend exporting. The exported model will be written to a directory called “exporter”, and the serving input function specifies what the end user will be expected to provide to the prediction service.
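For example, a serving input function might look roughly like this (the feature name and shape are placeholders for whatever your model actually expects at prediction time):

    import tensorflow as tf

    def serving_input_fn():
        # "inputs" and its shape are placeholders; use whatever features
        # your model_fn expects when serving predictions.
        inputs = tf.placeholder(dtype=tf.float32, shape=[None, 28, 28], name="inputs")
        return tf.estimator.export.ServingInputReceiver(
            features={"inputs": inputs},
            receiver_tensors={"inputs": inputs})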
You can use the BestExporter class to export only the models that are performing best:
https://www.tensorflow.org/api_docs/python/tf/estimator/BestExporter
This class exports the serving graph and checkpoints of the best models. It also performs a model export every time the new model is better than any existing model.
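A minimal sketch of wiring it up (the metric name, estimator, and input functions are assumed to come from your existing code, and serving_input_fn is the one sketched above):

    import tensorflow as tf

    # compare_fn decides whether the new eval result beats the current best.
    # The "accuracy" key is an assumption; use whatever metric your model_fn
    # reports in its eval metric ops.
    def compare_accuracy(best_eval_result, current_eval_result):
        return current_eval_result["accuracy"] > best_eval_result["accuracy"]

    exporter = tf.estimator.BestExporter(
        name="best_exporter",
        serving_input_receiver_fn=serving_input_fn,
        compare_fn=compare_accuracy,
        exports_to_keep=5)

    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=10000)
    eval_spec = tf.estimator.EvalSpec(
        input_fn=eval_input_fn,
        exporters=[exporter])

    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

Each time evaluation runs and the new model beats the best one seen so far (according to compare_fn), a fresh export is written under the exporter's directory, so you don't have to copy checkpoint files yourself.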