I'm relatively new to neural networks. At the moment I am trying to program a neural network for simple image recognition of numbers between 0 and 10. The activation function I'm aiming for is ReLU (rectified linear unit).
With the sigmoid function it is pretty clear how you can determine a probability for a certain case in the end (because its output is between 0 and 1).
But as far as I understand it, with ReLU we don't have this limitation: the sum over the previous "neurons" can end up being any value. So how is this commonly solved?
I hope my question is understandable. Thanks in advance for taking the time to look at my question.
You can't use the ReLU function as the output activation for classification tasks because, as you mentioned, its range is [0, ∞) and therefore can't represent a probability between 0 and 1. That's why it is used in hidden layers, and as an output activation only for regression tasks.
For binary classification, you have to use an output function with a range between 0 and 1, such as the sigmoid. In your case (a multi-class problem with one class per digit), you would need its multidimensional extension, the softmax function, which turns the unbounded output values into a probability distribution over the classes.
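Here is a minimal NumPy sketch of that pattern (the layer sizes and random weights are made up purely for illustration): ReLU in the hidden layer, softmax only at the output, so the final values are valid probabilities even though the hidden activations are unbounded.

```python
import numpy as np

def relu(x):
    # ReLU: clips negative values to 0; output range is [0, inf)
    return np.maximum(0, x)

def softmax(z):
    # Softmax: turns arbitrary real-valued scores ("logits") into
    # probabilities that are each in (0, 1) and sum to 1.
    # Subtracting the max first is a standard numerical-stability trick.
    z = z - np.max(z)
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)

# Hypothetical tiny network for 10 digit classes:
rng = np.random.default_rng(0)
x = rng.normal(size=784)                      # e.g. a flattened 28x28 image
W1, b1 = 0.01 * rng.normal(size=(128, 784)), np.zeros(128)
W2, b2 = 0.01 * rng.normal(size=(10, 128)), np.zeros(10)

hidden = relu(W1 @ x + b1)   # unbounded non-negative activations
logits = W2 @ hidden + b2    # still unbounded real values
probs = softmax(logits)      # now a probability distribution over 10 classes

print(probs.sum())           # 1.0
print(probs.argmax())        # index of the most likely digit
```

The prediction is then simply the class with the highest probability, and during training you would typically pair this softmax output with a cross-entropy loss.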