Assume a deep learning problem where each image contains exactly one object, and we want to classify that object as one of
Y={Cat:1, Dog:2, Panda:3}
Can we address this problem using neural networks in two ways:
Questions are:
a) Do these two systems have the same performance?
b) I have seen "sparse_categorical_crossentropy" in TensorFlow. Does it implicitly convert the labels Y={1,2,3} to Y={[1 0 0], [0 1 0], [0 0 1]}, so that if I use "sparse_categorical_crossentropy" with labels Y={1,2,3}, I should make the last layer a softmax layer?
The two systems are not equivalent: different loss functions produce different gradients during backpropagation, so training proceeds differently. The final performance may still be similar, but you have to try both to see how close they are. Cross-entropy loss is the usual choice for this kind of classification problem.
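To make the point about gradients concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, that both systems end in the same 3-way softmax and differ only in the loss: one uses cross-entropy, the other a sum of squared errors against the one-hot target. That pairing and the example logits are my assumptions, not something taken from the question's two setups.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def grad_cross_entropy(logits, target):
    # For softmax followed by cross-entropy, dL/dz_k = p_k - t_k.
    p = softmax(logits)
    return [pk - tk for pk, tk in zip(p, target)]

def grad_squared_error(logits, target):
    # L = sum_j (p_j - t_j)^2, so by the chain rule
    # dL/dz_k = sum_j 2 (p_j - t_j) * p_j * (delta_jk - p_k).
    p = softmax(logits)
    n = len(p)
    grads = []
    for k in range(n):
        g = 0.0
        for j in range(n):
            dp_j_dz_k = p[j] * ((1.0 if j == k else 0.0) - p[k])
            g += 2.0 * (p[j] - target[j]) * dp_j_dz_k
        grads.append(g)
    return grads

# Hypothetical example: the network favours class 0, the true class is 1.
logits = [2.0, -1.0, 0.5]
target = [0.0, 1.0, 0.0]

g_ce = grad_cross_entropy(logits, target)
g_se = grad_squared_error(logits, target)
print("cross-entropy gradient:", g_ce)
print("squared-error gradient:", g_se)
```

Same network output, same target, but the two losses send different gradients back through the network, so the weight updates differ from the first step on.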
Regarding sparse_categorical_crossentropy in TensorFlow: according to this page, you can either pass the model outputs as logits (no softmax layer) and set from_logits=True, or leave from_logits at its default value (False) and use a softmax layer.
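On the "implicit conversion" part of the question: the sparse loss does not need to build one-hot vectors, but it gives the same value as ordinary categorical cross-entropy on the one-hot label. The sketch below checks this in plain Python rather than TensorFlow; the helper names and the example logits are hypothetical. One caveat I am confident about: the Keras sparse loss expects integer labels in the range 0 to num_classes-1, so the labels {1,2,3} would normally be shifted to {0,1,2}.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_ce_from_probs(label, probs):
    # Integer label: pick the probability of the true class.
    return -math.log(probs[label])

def categorical_ce_from_probs(one_hot, probs):
    # One-hot label: only the true-class term survives the sum.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_ce_from_logits(label, logits):
    # Mirrors the from_logits=True idea: log-softmax is applied inside the loss.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

logits = [2.0, -1.0, 0.5]     # raw outputs of a last layer without softmax
label = 1                     # zero-based label (Dog, after shifting {1,2,3} to {0,1,2})
one_hot = [0.0, 1.0, 0.0]     # the same label, one-hot encoded

probs = softmax(logits)       # what a softmax last layer would output
loss_sparse = sparse_ce_from_probs(label, probs)
loss_onehot = categorical_ce_from_probs(one_hot, probs)
loss_logits = sparse_ce_from_logits(label, logits)
print(loss_sparse, loss_onehot, loss_logits)
```

All three numbers agree up to floating-point error, which is why either combination works: softmax layer with from_logits=False, or no softmax with from_logits=True. Applying softmax and also setting from_logits=True would apply the normalisation twice.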