Tags: python, tensorflow, word2vec, word-embedding

How does the tensorflow word2vec tutorial update embeddings?


This thread comes close: What is the purpose of weights and biases in tensorflow word2vec example?

But I am still missing something from my interpretation of this: https://github.com/tensorflow/tensorflow/blob/r1.2/tensorflow/examples/tutorials/word2vec/word2vec_basic.py

From what I understand, you feed the network the indices of target and context words from your dictionary.

    _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
    average_loss += loss_val
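The two tensors being fed are just integer word indices. A minimal sketch of what one feed might look like, assuming the placeholder shapes used in the tutorial script (the index values here are made up):

    import numpy as np
    import tensorflow as tf

    batch_size = 8

    # Placeholders as defined earlier in the tutorial script (assumed shapes):
    train_inputs = tf.placeholder(tf.int32, shape=[batch_size])      # target (center) words
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])   # context words

    # Made-up indices standing in for what the batch generator would return:
    feed_dict = {
        train_inputs: np.array([12, 12, 6, 6, 195, 195, 2, 2], dtype=np.int32),
        train_labels: np.array([[6], [3084], [12], [195], [6], [2], [195], [46]],
                               dtype=np.int32),
    }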

The batch inputs are then looked up, returning the vectors that were randomly initialized at the beginning:

    embeddings = tf.Variable(
        tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    # Look up embeddings for inputs.
    embed = tf.nn.embedding_lookup(embeddings, train_inputs)
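As a side note, the lookup is just a differentiable row gather from the embeddings table, which is why gradients can later flow back into it. A toy illustration (not part of the tutorial):

    import numpy as np
    import tensorflow as tf

    embeddings = tf.Variable(tf.random_uniform([5, 3], -1.0, 1.0))  # toy 5-word vocab, 3-dim vectors
    ids = tf.constant([2, 0, 2])
    embed = tf.nn.embedding_lookup(embeddings, ids)                  # gathers rows 2, 0, 2

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        looked_up, table = sess.run([embed, embeddings])
        print(np.allclose(looked_up, table[[2, 0, 2]]))              # True: it is just a row gather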

Then an optimizer adjusts the weights and biases to best predict the label, as opposed to the num_sampled random alternatives:

    loss = tf.reduce_mean(
        tf.nn.nce_loss(weights=nce_weights,
                       biases=nce_biases,
                       labels=train_labels,
                       inputs=embed,
                       num_sampled=num_sampled,
                       num_classes=vocabulary_size))

    # Construct the SGD optimizer using a learning rate of 1.0.
    optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
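For intuition only, here is a rough per-example version of the NCE objective in plain numpy. It is a simplification (it omits the sampling-probability correction that tf.nn.nce_loss applies to the logits), and the function is made up for illustration:

    import numpy as np

    def nce_loss_sketch(embed_vec, true_idx, sampled_idx, nce_weights, nce_biases):
        """Rough sketch: score the true word high and the sampled noise words low."""
        def logit(word):
            return np.dot(embed_vec, nce_weights[word]) + nce_biases[word]

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        loss = -np.log(sigmoid(logit(true_idx)))          # true (input, label) pair
        for word in sampled_idx:
            loss += -np.log(1.0 - sigmoid(logit(word)))   # randomly sampled "negative" words
        return loss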

My questions are as follows:

  1. Where does the embeddings variable get updated? It appears to me that I could get the final result either by running the index of a word through the neural network, or by just taking the final_embeddings vectors and using those. But I do not understand where embeddings is ever changed from its random initialization.

  2. If I were to draw this computation graph, what would it look like (or better yet, what is the best way to actually do so)?

  3. Is this running all of the context/target pairs in the batch at once? Or one by one?


Solution

  • Embeddings: embeddings is a variable. It gets updated every time you do backprop, i.e. every time you run optimizer together with loss (see the sketch after this list).

  • Graph: Did you try saving the graph and displaying it in TensorBoard? Is that what you're looking for? (The sketch below also shows how to write the graph out.)

  • Batching: At least in the example you linked, it is doing batch processing using the function at line 96: https://github.com/tensorflow/tensorflow/blob/r1.2/tensorflow/examples/tutorials/word2vec/word2vec_basic.py#L96
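Putting the first two points together, here is a self-contained toy version of the tutorial graph (sizes shrunk, names kept from the script). Treat it as a sketch under those assumptions rather than the tutorial itself; it shows that embeddings is one of the variables the optimizer updates, and how to write the graph out for TensorBoard:

    import math
    import tensorflow as tf

    vocabulary_size, embedding_size, batch_size, num_sampled = 50, 8, 4, 5

    train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

    embeddings = tf.Variable(tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    embed = tf.nn.embedding_lookup(embeddings, train_inputs)

    nce_weights = tf.Variable(tf.truncated_normal([vocabulary_size, embedding_size],
                                                  stddev=1.0 / math.sqrt(embedding_size)))
    nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

    loss = tf.reduce_mean(
        tf.nn.nce_loss(weights=nce_weights, biases=nce_biases,
                       labels=train_labels, inputs=embed,
                       num_sampled=num_sampled, num_classes=vocabulary_size))
    optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)

    # minimize(loss) creates update ops for every trainable variable the loss
    # depends on -- embeddings, nce_weights and nce_biases are all in this list:
    print([v.name for v in tf.trainable_variables()])

    # The gradient w.r.t. embeddings is sparse (IndexedSlices): only the rows
    # that were looked up in the current batch get moved on each step.
    print(type(tf.gradients(loss, embeddings)[0]))

    # Write the graph out, then run: tensorboard --logdir=/tmp/word2vec_graph
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        tf.summary.FileWriter('/tmp/word2vec_graph', sess.graph)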

    Please correct me if I misunderstood your question.