I am trying to follow the tutorial for Language Modeling on the TensorFlow site. I see it runs and the cost goes down and it is working great, but I do not see any way to actually get the predictions from the model. I tried following the instructions at this answer but the tensors returned from session.run are floating point values like 0.017842259, and the dictionary maps words to integers so that does not work.
How can I get the predicted word from a TensorFlow model?
Edit: I found this explanation after searching around, I just am not sure what x and y would be in the context of this example. They don't seem to use the same conventions for this example as they do in the explanation.
The tensor you are mentioning is the loss, which measures how well the network is fitting the training data. For prediction, you need to access the tensor probabilities, which contains the probabilities for the next word. If this were a classification problem, you'd just do argmax to get the ID of the most probable word. But to also give lower-probability words a chance of being generated, some kind of sampling is often used.
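As a rough illustration of the difference between argmax and sampling, here is a minimal NumPy sketch. The vocabulary, the probability values, and the helper names greedy_word and sample_word are all made up for the example; in the tutorial you would use the array that session.run returns for the probabilities tensor and the tutorial's own word-to-ID dictionary, inverted so that IDs map back to words.

```python
import numpy as np

# Hypothetical vocabulary; the tutorial builds a similar word -> id mapping.
word_to_id = {"the": 0, "cat": 1, "sat": 2, "<eos>": 3}
# Invert it so a predicted integer ID can be turned back into a word.
id_to_word = {i: w for w, i in word_to_id.items()}

# Hypothetical next-word probabilities for a single time step (they sum to 1).
probabilities = np.array([0.1, 0.6, 0.25, 0.05])


def greedy_word(probs):
    """Return the single most probable word (argmax over the vocabulary)."""
    return id_to_word[int(np.argmax(probs))]


def sample_word(probs, rng):
    """Draw a word at random, weighted by the predicted probabilities."""
    word_id = rng.choice(len(probs), p=probs)
    return id_to_word[int(word_id)]


rng = np.random.default_rng(0)
print(greedy_word(probabilities))
print(sample_word(probabilities, rng))
```

The greedy version always returns the same word for the same input, while the sampling version can return any word with non-zero probability, which tends to give more varied generated text.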
Edit: I assume the code you used is this. In that case, look at line 148 (logits): its output can be converted into probabilities by simply applying the softmax function to it, as shown in the pseudocode on the TensorFlow website. Hope this helps.
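To make the logits-to-probabilities step concrete, here is a small NumPy sketch of softmax. The logits values are invented, and this standalone softmax function is only an illustration; inside the TensorFlow graph you would more likely apply tf.nn.softmax to the logits tensor and fetch the result with session.run.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores into probabilities along the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    # Subtracting the row maximum avoids overflow and does not change the result.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical logits: 2 positions, vocabulary of 3 words.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 3.0]])

probs = softmax(logits)                 # each row now sums to 1
predicted_ids = probs.argmax(axis=-1)   # most probable word ID per position
print(probs)
print(predicted_ids)
```

Because softmax preserves ordering, taking argmax of the logits directly gives the same word IDs; the softmax step matters when you want actual probabilities, for example to sample from them.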