Which loss function should I choose for a sequence classification problem?

My problem is as follows:

Input: [sequence of characters]

Output: [sequence of characters]

Both input and output are BOW representations.

E.g. X = [12,3,4,5,6] ---> Y = [1,4,5,7,8]

I am planning to use a Keras LSTM for the above task.

What should my loss function be?

Solution

The most standard way is to model the output distribution at each position with a softmax; the appropriate loss function is then categorical cross-entropy.

Standard categorical cross-entropy expects the targets as one-hot vectors. If you want to use the indices in Y directly, use sparse categorical cross-entropy.
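To make the difference concrete, here is a minimal sketch in plain Python (no Keras needed) showing that the two losses compute the same value and differ only in how the targets are encoded. The toy logits, the vocabulary size of 4, and the helper names are assumptions made up for illustration; in Keras the corresponding choices are `loss="categorical_crossentropy"` and `loss="sparse_categorical_crossentropy"`.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, num_classes):
    # Vector of zeros with a single 1.0 at position `index`.
    return [1.0 if k == index else 0.0 for k in range(num_classes)]

def categorical_crossentropy(one_hot_targets, probs):
    # Targets are one-hot vectors; loss is averaged over time steps.
    losses = [
        -sum(t * math.log(p) for t, p in zip(target, prob))
        for target, prob in zip(one_hot_targets, probs)
    ]
    return sum(losses) / len(losses)

def sparse_categorical_crossentropy(index_targets, probs):
    # Targets are integer class indices; loss is averaged over time steps.
    losses = [-math.log(prob[idx]) for idx, prob in zip(index_targets, probs)]
    return sum(losses) / len(losses)

# Hypothetical toy data: 3 time steps, vocabulary of 4 symbols.
logits = [
    [2.0, 0.5, 0.1, -1.0],
    [0.0, 1.5, 0.3, 0.2],
    [0.1, 0.1, 3.0, 0.0],
]
probs = [softmax(row) for row in logits]

y_sparse = [0, 1, 2]                           # indices, as in Y above
y_one_hot = [one_hot(i, 4) for i in y_sparse]  # same targets, one-hot encoded

loss_dense = categorical_crossentropy(y_one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(y_sparse, probs)
print(loss_dense, loss_sparse)  # the two values agree up to rounding
```

The sparse variant avoids materialising a one-hot vector per time step, which matters when the vocabulary is large.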

(See example two in this tutorial; it seems to do exactly what you want.)
