Tags: python, deep-learning, tf.keras, loss-function

Why did I get two different losses for sparse_categorical_crossentropy and categorical_crossentropy?


I trained a model for multiclass classification with three classes. In the first approach, I converted the classes into one-hot vectors and trained the model with categorical_crossentropy as the loss function, reaching a loss of 0.07 after 1000 epochs. When I used the same approach but did not convert the classes into one-hot vectors and instead used sparse_categorical_crossentropy as the loss function, I reached a loss of 0.05 after 1000 epochs. Does this mean that sparse_categorical_crossentropy is better than categorical_crossentropy?

Thank You!
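For reference, a minimal sketch of the two setups described in the question (the model architecture, data, and variable names here are assumptions for illustration, not the asker's actual code):

```python
import numpy as np
import tensorflow as tf

num_classes = 3
x = np.random.rand(300, 4).astype("float32")             # assumed features
y_int = np.random.randint(0, num_classes, size=(300,))   # integer labels: 0, 1, 2
y_onehot = tf.keras.utils.to_categorical(y_int, num_classes)

def build_model():
    # Same small architecture for both runs
    return tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

# Approach 1: one-hot labels + categorical_crossentropy
model_a = build_model()
model_a.compile(optimizer="adam", loss="categorical_crossentropy")
model_a.fit(x, y_onehot, epochs=10, verbose=0)

# Approach 2: integer labels + sparse_categorical_crossentropy
model_b = build_model()
model_b.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model_b.fit(x, y_int, epochs=10, verbose=0)
```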


Solution

  • You can't compare two loss functions in terms of their loss values, since changing the loss changes what is being measured. Instead, compare the models' performance on the same test dataset.

    In general, use sparse_categorical_crossentropy when your classes are mutually exclusive (i.e., each sample belongs to exactly one class) and categorical_crossentropy when a sample can have multiple classes or the labels are soft probabilities (like [0.5, 0.3, 0.2]).

    You got different losses because the representation of the labels changed. In Keras, sparse_categorical_crossentropy is simply categorical crossentropy with integer targets, as the quick check below shows.
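A quick way to confirm this is to evaluate both loss functions on the same predictions, with the labels given in the matching representation (a minimal check, assuming TensorFlow 2.x):

```python
import tensorflow as tf

# Same 3-class predictions; labels given in the two representations.
y_pred = tf.constant([[0.8, 0.1, 0.1],
                      [0.2, 0.6, 0.2]])
y_int = tf.constant([0, 1])              # integer targets
y_onehot = tf.one_hot(y_int, depth=3)    # one-hot targets

cce = tf.keras.losses.CategoricalCrossentropy()
scce = tf.keras.losses.SparseCategoricalCrossentropy()

# Both print the same value (~0.367): sparse_categorical_crossentropy
# is categorical crossentropy applied to integer targets.
print(cce(y_onehot, y_pred).numpy())
print(scce(y_int, y_pred).numpy())
```

So the 0.07 vs 0.05 gap comes from the two training runs themselves (different initializations and training dynamics), not from one loss function being inherently better than the other.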