So I was reading the TensorFlow "Getting Started" tutorial and I found it very hard to follow. A lot of explanation is missing about what each function does and why it is necessary (or not).
1. In the tf.estimator section, what are the "x_eval" and "y_eval" arrays supposed to be? The x_train and y_train arrays give the desired output (the corresponding y coordinate) for a given x coordinate. But the x_eval and y_eval values look incorrect: for x=5, y should be -4, not -4.1. Where do those values come from? What do x_eval and y_eval mean? Are they necessary? How were those values chosen?
2. The difference between "input_fn" (what does "fn" even mean?) and "train_input_fn". I see that the only difference is that one has
num_epochs=None, shuffle=True
and the other has
num_epochs=1000, shuffle=False
but I don't understand what "input_fn" or "train_input_fn" are or do, what the difference between the two is, or whether both are necessary.
3. In the
estimator.train(input_fn=input_fn, steps=1000)
piece of code, I don't understand the difference between "steps" and "num_epochs". What does each one mean? Can you have num_epochs=1000 and steps=1000 too?
These are just some of the questions that bugged me while reading the "Getting Started" tutorial. I personally think it leaves a lot to be desired, since it's very unclear what each thing does, and you can at best guess.
I agree with you that tf.estimator is not very well introduced in this "Getting Started" tutorial. I also think that some machine learning background would help with understanding what happens in the tutorial.
As for the answers to your questions:
In machine learning, we usually minimize the loss of the model on the training set, and then we evaluate the performance of the model on a separate evaluation set. This is because it is easy to overfit the training set and get 100% accuracy on it, so using a separate evaluation set makes it impossible to cheat in this way.
(x_train, y_train) corresponds to the training set, where the global minimum of the loss is obtained for W=-1, b=1. (x_eval, y_eval) corresponds to the evaluation set, and it doesn't have to perfectly follow the distribution of the training set. Although we can get a loss of 0 on the training set, we obtain a small nonzero loss on the evaluation set because we don't have exactly y_eval = -x_eval + 1.
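To make this concrete, here is a minimal standalone check (plain Python, no TensorFlow), using the model y = W*x + b with squared-error loss and the arrays from the tutorial (the exact x_eval/y_eval values may differ slightly in your version of the tutorial):

```python
# Squared-error loss of the linear model y = W * x + b on a dataset.
def loss(W, b, xs, ys):
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys))

# Training set: fits y = -x + 1 exactly.
x_train, y_train = [1.0, 2.0, 3.0, 4.0], [0.0, -1.0, -2.0, -3.0]
# Evaluation set (from the tutorial): deliberately close to, but not
# exactly on, the line y = -x + 1.
x_eval, y_eval = [2.0, 5.0, 8.0, 1.0], [-1.01, -4.1, -7.0, 0.0]

train_loss = loss(-1.0, 1.0, x_train, y_train)  # exactly 0
eval_loss = loss(-1.0, 1.0, x_eval, y_eval)     # small but nonzero
print(train_loss, eval_loss)
```

With W=-1 and b=1 the training loss is exactly 0, while the evaluation loss is small but nonzero, which is exactly what the tutorial's output shows.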
input_fn means "input function"; the name indicates that the object input_fn is a function. In tf.estimator, you need to provide an input function if you want to train the estimator (estimator.train()) or evaluate it (estimator.evaluate()).
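Since an input function is just a callable that produces batches of (features, labels), the idea can be sketched without TensorFlow. This is only an illustrative stand-in for the tutorial's numpy_input_fn helper (a real input_fn returns TensorFlow tensors, not Python lists); the factory name make_input_fn is my own, not part of the API:

```python
import random

# Illustrative sketch: an "input function" is a function that, when called,
# produces batches of (features, labels).
def make_input_fn(xs, ys, batch_size, num_epochs, shuffle):
    def input_fn():
        order = list(range(len(xs)))
        epoch = 0
        # num_epochs=None means "repeat the dataset forever".
        while num_epochs is None or epoch < num_epochs:
            if shuffle:
                random.shuffle(order)
            for start in range(0, len(order), batch_size):
                idx = order[start:start + batch_size]
                yield [xs[i] for i in idx], [ys[i] for i in idx]
            epoch += 1
    return input_fn

# Training input: repeats forever, shuffled. Evaluation input: one fixed
# pass over the data, in order.
train_input_fn = make_input_fn([1.0, 2.0, 3.0, 4.0], [0.0, -1.0, -2.0, -3.0],
                               batch_size=2, num_epochs=None, shuffle=True)
eval_input_fn = make_input_fn([2.0, 5.0, 8.0, 1.0], [-1.01, -4.1, -7.0, 0.0],
                              batch_size=2, num_epochs=1, shuffle=False)
```

The two settings you noticed (num_epochs=None, shuffle=True versus a finite num_epochs with shuffle=False) are exactly the difference between a training input and an evaluation input.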
We use different input functions for training and for evaluation: train_input_fn and eval_input_fn (the input_fn in the tutorial is almost equivalent to train_input_fn and is just confusing).
The number of epochs is the number of times we repeat the entire dataset. For instance, if we train for 10 epochs, the model will see each input 10 times.
When we train a machine learning model, we usually use mini-batches of data. For instance, if we have 1,000 images, we can train on batches of 100 images, so each epoch consists of 10 batches. One training step processes one batch, so training for 10 epochs means training for 100 steps.
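The relationship between steps, epochs, batch size, and dataset size can be written out explicitly (a quick sanity check using the 1,000-image example above):

```python
# One training step processes one batch, so:
#   steps_per_epoch = dataset_size / batch_size
#   total_steps     = num_epochs * steps_per_epoch

dataset_size = 1000  # e.g. 1,000 images
batch_size = 100
num_epochs = 10

steps_per_epoch = dataset_size // batch_size  # 10 batches per epoch
total_steps = num_epochs * steps_per_epoch    # 100 steps in total
print(steps_per_epoch, total_steps)
```

As I understand the estimator API, estimator.train(input_fn=input_fn, steps=1000) stops after 1000 batches or when the input function runs out of data, whichever comes first. That is why the training input uses num_epochs=None (repeat forever) and lets steps decide when to stop. So yes, you can have num_epochs=1000 and steps=1000 at the same time; whichever limit is reached first ends training.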
Once the estimator is trained, you can access the list of variables through estimator.get_variable_names() and the value of a variable through estimator.get_variable_value(). Usually we never need to do that, as we can for instance use the trained estimator to predict on new examples, using estimator.predict().
If you feel that the "Getting Started" tutorial is confusing, you can always submit a GitHub issue to tell the TensorFlow team and explain your point.