
Why is CNTK using the embedding dimension for the decoder?


I have trained a sequence-to-sequence model that works well when I test it on my local machine, but now that I'm trying to evaluate a large batch of queries I'm seeing this error:

02/08/2017 00:50:54: EXCEPTION occurred: Node 'decoderInput._' (If operation): Input dimensions [100] and [57408 x 3] are not compatible.

57408 is the vocabulary size. I'm guessing the 100 comes from the embedding dimension, which is set to 100.

I'm confused about why this is not working, because both the input and the output are declared as "sparse" in cntkReaderInputDef:

cntkReaderInputDef = { rawInput = { alias = "S0" ; dim = $inputVocabSize$ ; format = "sparse" } ; rawLabels = { alias = "S1" ;  dim = $labelVocabSize$ ;  format = "sparse" } }

Solution

  • Posted by William Darling:

    Because you are using an embedding, you need to use a modified version of the CNTK.core.bs file. Line 1515 currently reads:

    decoderFeedback = /*EmbedLabels*/ (tokens.word) # [embeddingDim x Dnew]
    

    The next line is where your error is coming from:

    delayedDecoderFeedback = Boolean.If (Loop.IsFirst (labelSentenceStartEmbeddedScattered), labelSentenceStartEmbeddedScattered, Loop.Previous (decoderFeedback))
    

    The decoderFeedback has shape [W x Dnew], where W is the vocabulary size, but labelSentenceStartEmbeddedScattered has shape [E], where E is the embedding dimension. BrainScript has no good way to pass in the embedding macro used in the model definition, so you need to write the embedding out explicitly. Change line 1515 to:

    decoderFeedback = TransposeTimes(modelAsTrained.Einput, tokens.word)
    

    That turns your decoderFeedback representation into something compatible with the embedding shape.

    By the way, format = "sparse" in the reader definition only describes how your CTF input file is formatted. In the sparse format you write entries like 7:1, meaning a one-hot vector with a 1 at position 7, instead of writing out a long row of zeros as you would in the dense format.
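
To make the shape argument concrete, here is a minimal pure-Python sketch (not CNTK code) of the two ideas in the answer: expanding a sparse entry such as 7:1 into a one-hot vector, and multiplying by the transposed embedding matrix so that a [W x Dnew] one-hot matrix becomes [E x Dnew]. The tiny dimensions, the helper names, and the assumption that the embedding matrix is stored as [vocab x embeddingDim] are all illustrative and not taken from CNTK.core.bs.

```python
# Illustrative only: tiny sizes stand in for vocab = 57408 and embedding = 100.
VOCAB_SIZE = 8   # hypothetical
EMBED_DIM = 3    # hypothetical


def sparse_to_dense(field, dim):
    """Expand a sparse field such as "7:1" into a dense vector of length dim."""
    dense = [0.0] * dim
    for pair in field.split():
        index, value = pair.split(":")
        dense[int(index)] = float(value)
    return dense


def transpose_times(a, b):
    """Return transpose(a) * b for matrices stored as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    cols_b = len(b[0])
    assert len(b) == rows_a, "inner dimensions must match"
    return [
        [sum(a[k][i] * b[k][j] for k in range(rows_a)) for j in range(cols_b)]
        for i in range(cols_a)
    ]


def shape(m):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(m), len(m[0]))


# Hypothetical embedding matrix, ASSUMED to be laid out as [vocab x embeddingDim].
e_input = [[float(10 * w + e) for e in range(EMBED_DIM)] for w in range(VOCAB_SIZE)]

# Two candidate tokens (Dnew = 2), written the way a sparse input file would.
columns = [sparse_to_dense(f, VOCAB_SIZE) for f in ("7:1", "2:1")]
# Arrange them as a [vocab x Dnew] matrix: one column per token.
tokens_word = [[col[w] for col in columns] for w in range(VOCAB_SIZE)]

# Analogue of TransposeTimes(Einput, tokens.word): each one-hot column
# selects the matching embedding row, giving [embeddingDim x Dnew].
decoder_feedback = transpose_times(e_input, tokens_word)

print(shape(tokens_word))       # (8, 2)
print(shape(decoder_feedback))  # (3, 2)
```

With the real sizes, the same multiplication takes the [57408 x 3] one-hot candidates to [100 x 3], whose leading dimension now matches the [100] embedded sentence-start vector named in the error message.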