I would like to use an Embedding layer before feeding my input data into the LSTM network I am trying to create. Here is the relevant part of the code:
input_step1 = Input(shape=(SEQ_LENGTH_STEP1, NR_FEATURES_STEP1),
                    name='input_step1')

step1_lstm = CuDNNLSTM(50,
                       return_sequences=True,
                       return_state=True,
                       name="step1_lstm")

out_step1, state_h_step1, state_c_step1 = step1_lstm(input_step1)
I am a bit confused about how I am supposed to add an Embedding layer here.
Here is the description of the Embedding layer from the documentation:
keras.layers.Embedding(input_dim,
                       output_dim,
                       embeddings_initializer='uniform',
                       embeddings_regularizer=None,
                       activity_regularizer=None,
                       embeddings_constraint=None,
                       mask_zero=False,
                       input_length=None)
The confusing part is that my defined Input already has a sequence length and a number of features. Writing it here again:
input_step1 = Input(shape=(SEQ_LENGTH_STEP1, NR_FEATURES_STEP1),
                    name='input_step1')
When defining an Embedding layer, I am confused about which parameters of the Embedding function correspond to the "sequence length" and the "number of features in each time step". Can anyone guide me on how I can integrate an Embedding layer into my code above?
ADDENDUM:
If I try the following:
SEQ_LENGTH_STEP1 = 5
NR_FEATURES_STEP1 = 10
input_step1 = Input(shape=(SEQ_LENGTH_STEP1, NR_FEATURES_STEP1),
                    name='input_step1')

emb = Embedding(input_dim=NR_FEATURES_STEP1,
                output_dim=15,
                input_length=NR_FEATURES_STEP1)

input_step1_emb = emb(input_step1)

step1_lstm = CuDNNLSTM(50,
                       return_sequences=True,
                       return_state=True,
                       name="step1_lstm")

out_step1, state_h_step1, state_c_step1 = step1_lstm(input_step1_emb)
I get the following error:
ValueError: Input 0 of layer step1_lstm is incompatible with the layer:
expected ndim=3, found ndim=4. Full shape received: [None, 5, 10, 15]
I am obviously not doing the right thing. Is there a way to integrate an Embedding layer into the LSTM network above?
From the Keras Embedding documentation:
Arguments
- input_dim: int > 0. Size of the vocabulary, i.e. maximum integer index + 1.
- output_dim: int >= 0. Dimension of the dense embedding.
- input_length: Length of input sequences, when it is constant. This argument is required if you are going to connect Flatten then Dense layers upstream (without it, the shape of the dense outputs cannot be computed).
Therefore, from your description, I assume that:

input_dim corresponds to the vocabulary size of your dataset, i.e. the maximum integer index + 1. For example, the following dataset has 4 distinct words (come, back, peter, paul); with 1-based word indices (index 0 is conventionally reserved for padding/masking), its vocabulary size is 5:
data = ["Come back Peter,",
"Come back Paul"]
output_dim is an arbitrary hyperparameter that indicates the dimension of your embedding space. In other words, if you set output_dim=x, each word in the sentence will be characterized by x features.
input_length should be set to SEQ_LENGTH_STEP1 (an integer indicating the length of each sentence), assuming that all the sentences have the same length.

The output shape of an embedding layer is (batch_size, input_length, output_dim).
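You can check this shape directly with a minimal sketch (reusing the output_dim=15 from your addendum):

from keras import backend as K
from keras.layers import Input, Embedding

inp = Input(shape=(5,))   # (batch_size, input_length)
out = Embedding(input_dim=5, output_dim=15, input_length=5)(inp)
print(K.int_shape(out))   # (None, 5, 15), i.e. (batch_size, input_length, output_dim)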
Further notes regarding the addendum:

In your addendum, input_step1 is already 3-dimensional ((batch_size, 5, 10)), so the Embedding layer appends an extra axis and outputs a 4-dimensional tensor ((batch_size, 5, 10, 15)), which is exactly what the ValueError complains about. Assuming that your first layer is an Embedding layer, the expected shape of the input tensor input_step1 is (batch_size, input_length):
input_step1 = Input(shape=(SEQ_LENGTH_STEP1,),
                    name='input_step1')
Each integer in this tensor corresponds to a word.
As mentioned above, the embedding layer could be instantiated as follows:

emb = Embedding(input_dim=VOCAB_SIZE,
                output_dim=15,
                input_length=SEQ_LENGTH_STEP1)

where VOCAB_SIZE is the size of your vocabulary.
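Putting it all together, a minimal end-to-end sketch might look like this (VOCAB_SIZE is a placeholder you would set from your data):

from keras.layers import Input, Embedding, CuDNNLSTM
from keras.models import Model

SEQ_LENGTH_STEP1 = 5
VOCAB_SIZE = 1000  # placeholder: maximum integer index + 1 in your data

# Each sample is a sequence of integer word indices
input_step1 = Input(shape=(SEQ_LENGTH_STEP1,), name='input_step1')

emb = Embedding(input_dim=VOCAB_SIZE,
                output_dim=15,
                input_length=SEQ_LENGTH_STEP1)

input_step1_emb = emb(input_step1)  # (batch_size, 5, 15)

step1_lstm = CuDNNLSTM(50,
                       return_sequences=True,
                       return_state=True,
                       name="step1_lstm")

out_step1, state_h_step1, state_c_step1 = step1_lstm(input_step1_emb)

model = Model(inputs=input_step1, outputs=out_step1)
model.summary()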