Tags: artificial-intelligence, conv-neural-network, relu

How is the range of the last layer of a Neural Network determined when using ReLU


I'm relatively new to Neural Networks. At the moment I am trying to program a Neural Network for simple image recognition of numbers between 0 and 10. The activation function I'm aiming for is ReLU (rectified linear unit).

With the sigmoid function it is pretty clear how you can determine a probability for a certain case in the end (because its output is between 0 and 1).

But as far as I understand it, with ReLU we don't have this limitation; the final layer can produce any value as a sum of the previous "neurons". So how is this commonly solved?

  • Do I just take the biggest of all values and say that's probability 100%?
  • Do I sum up all values and say that's 100%?
  • Or is there another approach I can't see at the moment?

I hope my question is understandable. Thanks in advance for taking the time to look at my question.


Solution

  • You can't use ReLU as the output function for classification tasks because, as you mentioned, its range is [0, ∞), so its outputs can't be interpreted as probabilities between 0 and 1. That's why ReLU is typically used only in hidden layers and for regression outputs.

    For binary classification, you have to use an output function whose range lies between 0 and 1, such as the sigmoid. In your case (multi-class classification over the digit classes), you need its multi-class generalization, the softmax function, which converts the raw outputs of the last layer into a probability distribution over the classes.
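
As a minimal sketch of what softmax does (using NumPy, with made-up logit values purely for illustration): each raw output is exponentiated and divided by the sum of all exponentials, so every result lies in (0, 1) and they sum to 1.

```python
import numpy as np

def softmax(logits):
    """Turn raw last-layer values (logits) into probabilities that sum to 1."""
    # Subtract the max for numerical stability; this does not change the result.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical raw outputs of the last layer for 10 digit classes.
logits = np.array([2.0, 0.5, -1.0, 3.2, 0.0, 1.1, -0.3, 0.7, 2.5, -2.0])
probs = softmax(logits)

print(probs)           # each value is between 0 and 1
print(probs.sum())     # 1.0
print(probs.argmax())  # index of the most likely class (here: 3)
```

In practice you would keep ReLU in the hidden layers and apply softmax only at the output layer; the predicted class is then simply the index with the highest probability.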