Tags: tensorflow, neural-network, softmax

softmax activation function interpretation


I have a neural network in TensorFlow with three hidden layers. The output layer has two neurons, and the target is a one-hot encoded value (possible outputs 0 or 1, so [1, 0] and [0, 1]). The input layer has 60 neurons, the activations in the hidden layers are ReLU, and I use AdamOptimizer with a learning rate of 0.001.
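For reference, here is a minimal sketch of the model setup (the hidden layer sizes are placeholders, not my exact values, and I use tf.layers.dense here instead of the explicit matmul for brevity; TensorFlow 1.x style):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 60])  # 60 input features
y = tf.placeholder(tf.float32, [None, 2])   # one-hot targets: [1, 0] or [0, 1]

# three ReLU hidden layers (sizes are placeholders)
h1 = tf.layers.dense(x, 100, activation=tf.nn.relu)
h2 = tf.layers.dense(h1, 100, activation=tf.nn.relu)
h3 = tf.layers.dense(h2, 100, activation=tf.nn.relu)

logits = tf.layers.dense(h3, 2)     # raw output scores
prediction = tf.nn.softmax(logits)  # should be probabilities summing to 1

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)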

I have a problem when I try to compute the result of the network model. prediction is the variable that represents the network's output:

prediction_run = sess.run(prediction, feed_dict={x: mydata.reshape(1, 60)})
print("Original class: ", [1, 0], "Predicted values: ", prediction_run)

This outputs: Original class: [ 1. 0.] Predicted values: [[ 1.00000000e+00 3.35827508e-08]]

Since I'm using softmax in the final layer, isn't the output supposed to sum to 1, like a probability distribution? I can't make sense of those predicted numbers, since softmax is supposed to transform them but apparently does not. This is the output layer:

self.tf.nn.softmax(self.tf.matmul(last_hidden_layer_activation, output_layer_weights) + output_layer_biases)

Any thoughts?


Solution

  • You are right: the softmax output is supposed to sum to 1, and in fact it does here.

    The issue is floating point precision. The two predicted values, 1.00000000e+00 and 3.35827508e-08, sum to about 1.0000000336, which is 1 up to float32 precision. The second value is not an absolute zero but a very small probability; softmax never produces an exact zero, only values very close to it. And because the spacing between adjacent float32 values near 1.0 is about 1.19e-7, adding ~3.36e-8 to 1.0 rounds back to exactly 1.0, which is why the first entry prints as 1.00000000e+00. There is always a bit of uncertainty in floating point. More information here
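
    A quick way to see this (a minimal sketch using NumPy for the float32 arithmetic; the logits below are made-up values chosen to reproduce the shape of your output):

    import numpy as np

    # Made-up logits where one class strongly dominates, mimicking your output
    logits = np.array([10.0, -7.2], dtype=np.float32)

    exps = np.exp(logits)
    probs = exps / exps.sum()   # softmax computed in float32

    print(probs)        # e.g. [1.0000000e+00 3.3...e-08]
    print(probs.sum())  # 1.0 -- the tiny term vanishes at float32 precision

    # The same effect in isolation: near 1.0 the spacing between adjacent
    # float32 values is about 1.19e-7, so adding 3.36e-8 rounds away entirely.
    print(np.float32(1.0) + np.float32(3.36e-8) == np.float32(1.0))  # True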