keras, lstm, recurrent-neural-network

Which loss function should I choose for a sequence classification problem?


My problem is as follows:

Input: [sequence of characters]

Output: [sequence of characters]

Both input and output are BOW representations.

E.g. X = [12, 3, 4, 5, 6] ---> Y = [1, 4, 5, 7, 8]

I am planning to use a Keras LSTM for the above task.

What should my loss function be?


Solution

  • The most standard way is to model the output distribution using softmax; the appropriate loss function is then categorical cross-entropy.

    Standard categorical cross-entropy expects the targets as one-hot vectors. If you want to use the indices in Y directly, use sparse categorical cross-entropy.

    (See example two in this tutorial; it seems to do exactly what you want.)
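To make the difference between the two losses concrete, here is a minimal pure-Python sketch (no Keras involved) showing that they compute the same value and differ only in how the target is encoded. The vocabulary size, the toy scores, and all helper names are made up for illustration. In Keras itself the corresponding loss identifiers are `categorical_crossentropy` and `sparse_categorical_crossentropy`.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, num_classes):
    """Encode a class index as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def categorical_crossentropy(one_hot_target, probs):
    """Cross-entropy with the target given as a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))

def sparse_categorical_crossentropy(target_index, probs):
    """Same loss, with the target given directly as a class index."""
    return -math.log(probs[target_index])

# Assumed toy setup: 10 possible symbols, two output positions.
vocab_size = 10
y = [1, 4]  # target indices, as in the first two entries of Y above
logits_per_step = [
    [0.1, 2.0, 0.3, 0.0, 0.5, 0.2, 0.1, 0.0, 0.4, 0.3],
    [0.2, 0.1, 0.0, 0.3, 1.5, 0.4, 0.1, 0.2, 0.0, 0.1],
]

dense_losses = []
sparse_losses = []
for target, logits in zip(y, logits_per_step):
    probs = softmax(logits)
    dense_losses.append(
        categorical_crossentropy(one_hot(target, vocab_size), probs)
    )
    sparse_losses.append(sparse_categorical_crossentropy(target, probs))

# Average over positions; both encodings give the same loss.
mean_dense = sum(dense_losses) / len(dense_losses)
mean_sparse = sum(sparse_losses) / len(sparse_losses)
print(math.isclose(mean_dense, mean_sparse))
```

Since the sparse variant skips building the one-hot vectors, it is usually the more convenient choice when Y already holds integer indices, as in the question.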