
How to implement a sequence classification LSTM network in CNTK?


I'm working on an implementation of an LSTM neural network for sequence classification. I want to design a network with the following parameters:

  1. Input: a sequence of n one-hot vectors.
  2. Network topology: a two-layer LSTM network.
  3. Output: the probability that a given sequence belongs to a class (binary classification). I want to take into account only the last output of the second LSTM layer.
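
For reference, the input format in point 1 can be produced as follows (a minimal NumPy sketch; the vocabulary size and token ids here are made-up placeholders, not from any particular dataset):

```python
import numpy as np

def one_hot_sequence(token_ids, vocab_size):
    """Encode a sequence of integer token ids as one-hot row vectors."""
    seq = np.zeros((len(token_ids), vocab_size))
    seq[np.arange(len(token_ids)), token_ids] = 1.0
    return seq

# A 3-step sequence over a hypothetical 5-word vocabulary.
x = one_hot_sequence([2, 0, 3], vocab_size=5)  # shape (3, 5)
```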

I need to implement this in CNTK, but I'm struggling because its documentation is not written very well. Can someone help me with that?


Solution

  • There is a sequence classification example that does exactly what you're looking for.

    The only difference is that it uses just a single LSTM layer. You can easily change this network to use multiple layers by changing:

    LSTM_function = LSTMP_component_with_self_stabilization(
        embedding_function.output, LSTM_dim, cell_dim)[0]
    

    to:

    num_layers = 2  # for example
    encoder_output = embedding_function.output
    for i in range(num_layers):
        # Feed each layer's output sequence into the next layer; note the [0],
        # since the component returns a pair and we want its output.
        encoder_output = LSTMP_component_with_self_stabilization(
            encoder_output, LSTM_dim, cell_dim)[0]
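
The pattern the loop expresses (each layer consuming the full output sequence of the layer below) can be sketched framework-free in NumPy. This is purely illustrative, with random weights and a hand-rolled LSTM cell; it is not CNTK's actual implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_layer(xs, hidden_dim, rng):
    """Run one LSTM layer over xs of shape (T, input_dim) and return
    the full output sequence of shape (T, hidden_dim)."""
    input_dim = xs.shape[1]
    # One combined weight matrix for the input, forget, output and cell gates.
    W = rng.standard_normal((4 * hidden_dim, input_dim + hidden_dim)) * 0.1
    b = np.zeros(4 * hidden_dim)
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    outputs = []
    for x in xs:
        z = W @ np.concatenate([x, h]) + b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(0)
seq = rng.standard_normal((6, 5))  # 6 time steps, input dimension 5
out = seq
for _ in range(2):                 # two stacked LSTM layers
    out = lstm_layer(out, hidden_dim=4, rng=rng)
# out is still a full sequence, ready for the next layer or for sequence.last
```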
    

    However, you'd be better served by using the new layers library. Then you can simply do this:

    encoder_output = Stabilizer()(input_sequence)
    for i in range(num_layers):
        encoder_output = Recurrence(LSTM(hidden_dim))(encoder_output)
    

    Then, to get the final output that you'd feed into a dense output layer, you can first do:

    final_output = sequence.last(encoder_output)
    

    and then

    z = Dense(vocab_dim)(final_output)
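
Putting those last two steps together, here is a NumPy sketch with placeholder random weights; for the binary case in the question, the dense layer's output dimension would be 2 (or 1 with a sigmoid) rather than vocab_dim:

```python
import numpy as np

rng = np.random.default_rng(1)
encoder_output = rng.standard_normal((6, 4))  # (time steps, hidden_dim)

# sequence.last: keep only the final time step of the top LSTM layer.
final_output = encoder_output[-1]

# Dense layer: affine projection to the number of classes (2 for binary).
W = rng.standard_normal((2, 4)) * 0.1
b = np.zeros(2)
z = W @ final_output + b

# Softmax turns the logits into class probabilities.
probs = np.exp(z - z.max())
probs /= probs.sum()
```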