Search code examples
tensorflowcomputer-visiondeep-learningtheanokeras

softmax and sigmoid function for the output layer


In the deep learning implementations related to object detection and semantic segmentation, I have seen the output layers using either sigmoid or softmax. I am not very clear when to use which? It seems to me both of them can support these tasks. Are there any guidelines for this choice?


Solution

  • softmax() helps when you want a probability distribution, which sums up to 1. sigmoid is used when you want the output to be ranging from 0 to 1, but need not sum to 1.

    In your case, you wish to classify and choose between two alternatives. I would recommend using softmax() as you will get a probability distribution which you can apply cross entropy loss function on.