Tags: python, tensorflow, lstm, google-cloud-ml-engine

tensorflow rnn_decoder perform softmax on each decoder_output


I am writing my own Estimator model_fn() for a GCP ML Engine package. I decode a sequence of outputs using embedding_rnn_decoder, as shown below:

outputs, state = tf.contrib.legacy_seq2seq.embedding_rnn_decoder(
    decoder_inputs=decoder_inputs,
    initial_state=curr_layer,
    cell=tf.contrib.rnn.GRUCell(hidden_units),
    num_symbols=n_classes,
    embedding_size=embedding_dims,
    feed_previous=False)

I know that outputs is "A list of the same length as decoder_inputs of 2D Tensors" (per the documentation), but how can I use this list to calculate the loss for the entire sequence?

I know that if I grab outputs[0] (i.e. only the first step's output), then I could calculate the loss as follows:

# loss for the first time step only
logits = tf.layers.dense(
    outputs[0],
    n_classes)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels))

Is it appropriate to generate a loss value for each of the items in outputs and then pass these all to tf.reduce_mean, as sketched below? This feels inefficient, especially for long sequences. Are there other, more efficient ways to calculate the softmax at each step of the sequence?
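
For concreteness, the per-step approach I am describing would look something like the sketch below. It assumes labels is a list of 1D int32 label tensors aligned with decoder_inputs (a hypothetical layout, not shown above), and it shares a single projection layer so every timestep uses the same weights:

# Sketch of the per-step approach (assumes `labels` is a list of 1D int32
# label tensors, one per decoder step; hypothetical layout).
projection = tf.layers.Dense(n_classes)  # one layer object, so weights are shared
step_losses = []
for output, step_labels in zip(outputs, labels):
    logits = projection(output)  # [batch_size, n_classes]
    step_losses.append(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=step_labels))
loss = tf.reduce_mean(tf.stack(step_losses))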


Solution

  • It looks like the solution to my problem is to use tf.contrib.legacy_seq2seq.sequence_loss_by_example, sketched below.
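
A minimal sketch of how that would fit the code above, assuming targets and weights are lists of 1D tensors aligned with outputs (hypothetical names: targets[t] holds the int32 labels for step t, and weights[t] is a float mask that zeroes out padded positions):

# Project every decoder output with one shared layer, then let
# sequence_loss_by_example combine the per-step cross-entropies.
projection = tf.layers.Dense(n_classes)
logits = [projection(output) for output in outputs]  # list of [batch_size, n_classes]

per_example_loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
    logits=logits,
    targets=targets,   # list of 1D int32 tensors (hypothetical name)
    weights=weights)   # list of 1D float tensors (hypothetical name)
loss = tf.reduce_mean(per_example_loss)  # per_example_loss has shape [batch_size]

Note that tf.contrib.legacy_seq2seq.sequence_loss (same module) also averages across the batch, in which case the final tf.reduce_mean would be unnecessary.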