python-3.x, tensorflow, neural-network, deep-learning, softmax

Why do the outputs sum to 1 in the TensorFlow Get started with Eager tutorial?


I did this tutorial: https://www.tensorflow.org/get_started/eager

It was very helpful, but I didn't understand why the outputs always sum to 1. It is stated "For this example, the sum of the output predictions are 1.0", but not explained. I thought it might be a characteristic of the activation function, but I read that ReLU can take any value > 0 (https://medium.com/the-theory-of-everything/understanding-activation-functions-in-neural-networks-9491262884e0).

I'd like to understand this because I want to learn in which cases one should normalize the output variables and in which cases it is not necessary (I assume that if they always sum to 1, it's not necessary).


Solution

  • In the given example, the statement that the outputs always sum to 1 refers to the softmax function that is used, and has nothing to do with normalization or your chosen activation function. In the tutorial's Iris example we want to distinguish between three classes, and of course the sum of the class probabilities cannot exceed 100% (1.0).

    For example, the softmax function, which sits at the end of the network, could return [0.8, 0.1, 0.1]. That means the first class has the highest probability. Note that the individual probabilities always sum to 1.0.
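
    Softmax produces this property by construction: it exponentiates each raw output (logit) and divides by the sum of all the exponentials, so every value is positive and they sum to 1. You can see it directly in eager mode; here is a minimal sketch (the logit values below are made up for illustration, not taken from the tutorial):

    ```python
    import tensorflow as tf

    # Hypothetical raw outputs (logits) of the last Dense layer
    # for a single Iris sample with 3 classes.
    logits = tf.constant([[2.0, 0.5, 0.1]])

    # Softmax: exp(logit_i) / sum_j exp(logit_j)
    probs = tf.nn.softmax(logits)

    print(probs)                  # approx. [[0.73, 0.16, 0.11]]
    print(tf.reduce_sum(probs))   # 1.0
    ```

    The hidden layers can still use ReLU (or any other activation); only the final softmax is responsible for the outputs summing to 1.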