Tags: tensorflow, neural-network, keras, classification, softmax

When using Keras categorical_crossentropy loss, should you use softmax on the last layer?


Most examples I've seen implement softmax on the last layer. But I read that Keras categorical_crossentropy automatically applies softmax after the last layer, so adding it yourself is redundant and leads to reduced performance. Who is right?


Solution

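  • By default, Keras categorical_crossentropy does not apply softmax to the output (see the categorical_crossentropy implementation and the TensorFlow backend call), so it expects probabilities and a softmax on the last layer is not redundant. However, if you use the backend function directly, you have the option of setting from_logits=True. In that case the loss applies softmax internally, and the last layer should output raw logits with no softmax activation.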
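
The two valid setups, and the mistake of combining them, can be checked numerically. This is a minimal sketch, assuming a TensorFlow 2.x install in which tf.keras.losses.categorical_crossentropy accepts a from_logits argument. The label and logit values are made up for illustration and do not come from the original post.

```python
import tensorflow as tf

# Illustrative values (assumption): one sample, three classes.
y_true = tf.constant([[0.0, 1.0, 0.0]])
logits = tf.constant([[2.0, 1.0, 0.1]])  # raw last-layer outputs, no activation
probs = tf.nn.softmax(logits)            # what a softmax last layer would output

# Option A: softmax in the model, default loss (from_logits=False).
loss_a = tf.keras.losses.categorical_crossentropy(y_true, probs)

# Option B: no softmax in the model, loss is told to expect logits.
loss_b = tf.keras.losses.categorical_crossentropy(
    y_true, logits, from_logits=True
)

# Mistake: softmax in the model AND from_logits=True, so softmax is applied twice.
loss_double = tf.keras.losses.categorical_crossentropy(
    y_true, probs, from_logits=True
)

# Each loss has shape (1,); convert to plain Python floats for comparison.
loss_a, loss_b, loss_double = (
    float(x.numpy()[0]) for x in (loss_a, loss_b, loss_double)
)

print("softmax layer + default loss:    ", loss_a)
print("logits + from_logits=True:       ", loss_b)
print("softmax layer + from_logits=True:", loss_double)
```

Options A and B should print the same value up to floating-point error, while the third value differs because the probabilities are squashed a second time. So pick one place for the softmax: either the last layer or the loss, not both.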