tensorflow neural-network keras classification softmax

When using Keras categorical_crossentropy loss, should you use softmax on the last layer?

Most examples I've seen apply softmax on the last layer. But I read that Keras categorical_crossentropy automatically applies softmax after the last layer, so adding it yourself is redundant and reduces performance. Which is right?

Solution

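By default, Keras categorical_crossentropy does not apply softmax to the output (see the categorical_crossentropy implementation and the TensorFlow backend call), so the softmax on the last layer in those examples is not redundant. However, if you use the backend function directly, you have the option of setting from_logits=True, in which case the loss applies softmax itself and the last layer should output raw logits instead.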
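
To make the difference concrete, here is a minimal pure-Python sketch of what a from_logits flag toggles. It is an illustration only, not the Keras implementation: softmax and cross_entropy are hypothetical helpers written for this example, and the labels and logits are made-up values.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_true, y_pred, from_logits=False):
    # Hypothetical helper mimicking the idea of the from_logits flag:
    # when True, softmax is applied inside the loss; when False, y_pred
    # is assumed to already be a probability distribution.
    probs = softmax(y_pred) if from_logits else y_pred
    return -sum(t * math.log(p) for t, p in zip(y_true, probs) if t > 0)

y_true = [0.0, 1.0, 0.0]   # one-hot label (made-up example)
logits = [2.0, 1.0, 0.1]   # raw scores from a last layer with no activation

# Option A: softmax on the last layer, loss with the default from_logits=False.
loss_softmax_layer = cross_entropy(y_true, softmax(logits))

# Option B: no softmax on the last layer, loss with from_logits=True.
loss_from_logits = cross_entropy(y_true, logits, from_logits=True)

# Mistake: softmax on the last layer AND from_logits=True (softmax applied twice).
loss_double_softmax = cross_entropy(y_true, softmax(logits), from_logits=True)

print(loss_softmax_layer, loss_from_logits, loss_double_softmax)
```

Options A and B give the same loss, so either setup is fine as long as you pick exactly one place to apply softmax. The third value differs because softmax is applied twice, which flattens the predicted distribution.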