I'm trying to create a language model. I have logit and target tensors, each of shape [32, 312, 512], where:

.shape[0] is batch_size
.shape[1] is sequence_max_len
.shape[2] is vocabulary size
The question is: when I pass logit and target to the loss function as follows:
self.loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        logits=self.logit, labels=self.y))
Does it compute the appropriate loss for the current batch? Or should I first reshape logit and target to the shape [32, 312*512]?
Thanks in advance for your help!
The answer is: it's irrelevant, no reshaping is needed, since tf.nn.softmax_cross_entropy_with_logits() has a dim argument:
dim: The class dimension. Defaulted to -1 which is the last dimension.
name: A name for the operation (optional).
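With dim left at its default of -1, the softmax and cross-entropy are taken over the last axis, which is your vocabulary axis. As a minimal sketch (random stand-in tensors with the shapes from the question, one-hot targets, eager TensorFlow; the names logits and targets below are placeholders, not your variables), the 3-D tensors can be passed in directly:

    import numpy as np
    import tensorflow as tf

    # Stand-in tensors with the shapes from the question.
    batch_size, seq_max_len, vocab_size = 32, 312, 512
    logits = tf.constant(
        np.random.randn(batch_size, seq_max_len, vocab_size), dtype=tf.float32)
    targets = tf.one_hot(
        np.random.randint(vocab_size, size=(batch_size, seq_max_len)),
        depth=vocab_size)

    # Cross-entropy is computed over the last (vocabulary) axis,
    # so the per-token losses come out with shape [32, 312] ...
    per_token_loss = tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=targets)

    # ... and reduce_mean then averages over batch and time, giving a scalar.
    loss = tf.reduce_mean(per_token_loss)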
Also, inside tf.nn.softmax_cross_entropy_with_logits() there is this code:
# Make precise_logits and labels into matrices.
precise_logits = _flatten_outer_dims(precise_logits)
labels = _flatten_outer_dims(labels)
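So the outer (batch and time) dimensions are already flattened for you internally. As a rough sketch (again with random stand-in tensors, and keeping the class axis last), flattening them by hand to [32*312, 512] gives the same mean loss as calling the op on the 3-D tensors directly:

    import numpy as np
    import tensorflow as tf

    batch_size, seq_max_len, vocab_size = 32, 312, 512
    logits = tf.constant(
        np.random.randn(batch_size, seq_max_len, vocab_size), dtype=tf.float32)
    targets = tf.one_hot(
        np.random.randint(vocab_size, size=(batch_size, seq_max_len)),
        depth=vocab_size)

    # Mean loss computed directly on the 3-D tensors.
    loss_3d = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=targets))

    # Mean loss after flattening the outer dimensions by hand:
    # [32, 312, 512] -> [32 * 312, 512], class axis kept last.
    loss_flat = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(
            logits=tf.reshape(logits, [-1, vocab_size]),
            labels=tf.reshape(targets, [-1, vocab_size])))

    # loss_3d and loss_flat agree up to floating-point rounding.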