What does from_logits=True or False mean in TensorFlow's sparse_categorical_crossentropy?

In TensorFlow 2.0, there is a loss function called

tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred, from_logits=False)

What is the difference between setting from_logits=True and from_logits=False?

My guess was that when the incoming values are logits, you set from_logits=True, and when the incoming values are probabilities (output by softmax, etc.), you set from_logits=False (which is the default).

But why? A loss is just some calculation. Why does it need to differ depending on its incoming values? I also saw that Google's TensorFlow tutorial does not set from_logits=True even though the outputs of the last layer are logits. Here is the code:

def train_step(inp, target):
  with tf.GradientTape() as tape:
    predictions = model(inp)
    loss = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(target, predictions))
  grads = tape.gradient(loss, model.trainable_variables)
  optimizer.apply_gradients(zip(grads, model.trainable_variables))

  return loss

where the model is

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    # ... (remaining layers are cut off in the original post)
])

which does not have a softmax as its last layer. (Also, in another part of the tutorial, from_logits=True is set.)

So does it not matter whether I set it to True or not?


  • The post Deepak mentioned has some math background.

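    But for simplicity: from_logits=True means the input to the cross-entropy loss is a tensor of raw, unnormalized scores (logits), and the loss applies the softmax itself. from_logits=False means the input is already a probability distribution, so you should usually have a softmax activation in your last layer. So it does matter: the flag has to match what your model actually outputs, otherwise the loss is computed on values that are not what it expects.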
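
To make the difference concrete, here is a minimal pure-Python sketch of what the flag changes conceptually. It is not TensorFlow's actual implementation. The helper names (softmax, sparse_ce_from_probs, sparse_ce_from_logits), the clipping constant eps, and the example logits are all assumptions made up for illustration.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_probs(label, probs, eps=1e-7):
    # Conceptual from_logits=False case: inputs are assumed to be probabilities.
    # Clip to avoid log(0); eps is an illustrative choice, not a library constant.
    p = min(max(probs[label], eps), 1.0 - eps)
    return -math.log(p)

def sparse_ce_from_logits(label, logits):
    # Conceptual from_logits=True case: the softmax is folded into the loss
    # using the log-sum-exp trick, so no separate softmax layer is needed.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

logits = [2.0, 1.0, 0.1]  # hypothetical raw scores for 3 classes
label = 0                 # integer class index ("sparse" label)

loss_logits = sparse_ce_from_logits(label, logits)     # logits + "from_logits=True"
loss_probs = sparse_ce_from_probs(label, softmax(logits))  # softmax + "from_logits=False"
loss_mismatch = sparse_ce_from_probs(label, logits)    # logits wrongly treated as probabilities

print(loss_logits, loss_probs, loss_mismatch)
```

In this sketch the first two losses agree, because applying softmax and then taking the negative log is the same calculation as the log-sum-exp form. The third value is meaningless: the raw score 2.0 is not a probability, so it gets clipped and the loss comes out close to zero no matter how good the prediction is. That is why the flag has to match the model's output.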