Does tf.nn.softmax_cross_entropy_with_logits account for batch size?
In my LSTM network, I feed batches of different sizes, and I would like to know whether I should normalize the error with respect to batch size before optimizing.
According to the documentation, softmax_cross_entropy_with_logits returns a 1-D tensor of per-example losses whose length equals the batch size. To get a scalar cost, apply tf.reduce_mean to that vector; since the result is averaged over the batch, the loss is not affected by batch size and needs no extra normalization. (Using tf.reduce_sum instead would scale with batch size and would need to be divided by it manually.)
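A minimal sketch to illustrate (the logits and labels values here are made up purely for demonstration; runs under TF2 eager execution):

```python
import tensorflow as tf

# Hypothetical batch of 4 examples, 3 classes each.
logits = tf.constant([[ 2.0, 0.5, -1.0],
                      [ 0.1, 0.2,  0.3],
                      [ 1.5, -0.5, 0.0],
                      [-1.0, 2.0,  0.5]])
labels = tf.constant([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])

# Per-example losses: a vector of shape [batch_size].
per_example = tf.nn.softmax_cross_entropy_with_logits(labels=labels,
                                                      logits=logits)

# Mean over the batch: already normalized, independent of batch size.
loss_mean = tf.reduce_mean(per_example)

# Sum over the batch: grows with batch size, so it would need
# manual division by the batch size before optimizing.
loss_sum = tf.reduce_sum(per_example)
```

With tf.reduce_mean, a gradient step computed on a batch of 4 and a batch of 400 is on the same scale, which is why no further normalization is needed.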