Tags: python, tensorflow, lstm, recurrent-neural-network, tensorflow2.0

Python TensorFlow 2.0: build a simple LSTM network without using Keras


I'm trying to build a TensorFlow LSTM network without using the Keras API. The model is very simple:

  1. input: a sequence of 4 word indices (shape [batch_size, num_steps] with num_steps = 4)
  2. an embedding layer mapping each index to a 100-dim word vector ([batch_size, 4, 100])
  3. an LSTM layer over the embedded sequence
  4. a dense layer projecting each step to the vocabulary, producing an output sequence of 4 words

The loss function is sequence loss (tfa.seq2seq.sequence_loss).
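
For reference, tfa.seq2seq.sequence_loss expects logits of shape [batch, time, vocab], integer targets of shape [batch, time], and per-step weights of shape [batch, time]. A minimal toy example (shapes chosen arbitrarily):

import tensorflow as tf
import tensorflow_addons as tfa

# toy shapes: batch of 2, 4 time steps, vocabulary of 10
logits = tf.random.normal([2, 4, 10])                           # [batch, time, vocab]
targets = tf.random.uniform([2, 4], maxval=10, dtype=tf.int32)  # word indices
weights = tf.ones([2, 4])                                       # weight every step equally

loss = tfa.seq2seq.sequence_loss(logits, targets, weights)      # averaged to a scalar by default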

I have the following code:

# input
input_placeholder = tf.placeholder(tf.int32, shape=[config.batch_size, config.num_steps], name='Input')
labels_placeholder = tf.placeholder(tf.int32, shape=[config.batch_size, config.num_steps], name='Target')

# embedding
embedding = tf.get_variable('Embedding', initializer=embedding_matrix, trainable=False)
inputs = tf.nn.embedding_lookup(embedding, input_placeholder)
inputs = [tf.squeeze(x, axis=1) for x in tf.split(inputs, config.num_steps, axis=1)]
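# inputs is now a list of num_steps tensors, each of shape [batch_size, embedding_dim]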

# LSTM
initial_state = tf.zeros([config.batch_size, config.hidden_size])
lstm_cell = tf.nn.rnn_cell.LSTMCell(config.hidden_size)
output, _ = tf.keras.layers.RNN(lstm_cell, inputs, dtype=tf.float32, unroll=True)

# loss op
all_ones = tf.ones([config.batch_size, config.num_steps])
cross_entropy = tfa.seq2seq.sequence_loss(output, labels_placeholder, all_ones, vocab_size)
tf.add_to_collection('total_loss', cross_entropy)
loss = tf.add_n(tf.get_collection('total_loss'))
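# collecting into 'total_loss' lets additional terms (e.g. regularization) be summed in later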

# projection (dense)
proj_U = tf.get_variable('Matrix', [config.hidden_size, vocab_size])
proj_b = tf.get_variable('Bias', [vocab_size])
outputs = [tf.matmul(o, proj_U) + proj_b for o in output]
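# outputs: list of num_steps tensors, each of shape [batch_size, vocab_size]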

The problem I have now is with the LSTM part. In TensorFlow 1.x I used:

# tensorflow 1.x
output, _ = tf.contrib.rnn.static_rnn(
        lstm_cell, inputs, dtype = tf.float32, 
        sequence_length = [config.num_steps]*config.batch_size)

I'm having trouble converting this to TensorFlow 2. With the tf.keras.layers.RNN attempt above, I get the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
----> 1 outputs, _ = tf.keras.layers.RNN(lstm_cell, inputs, dtype=tf.float32, unroll=True)

TypeError: cannot unpack non-iterable RNN object


Solution

  • The error occurs because tf.keras.layers.RNN(...) returns a layer object, not an (output, state) tuple. The direct TF 2.x replacement for tf.contrib.rnn.static_rnn is tf.compat.v1.nn.static_rnn; with that change the code below should work for TensorFlow 2.x.

    import tensorflow as tf
    import tensorflow_addons as tfa
    
    # placeholders require graph mode in TF 2.x
    tf.compat.v1.disable_eager_execution()
    # input
    input_placeholder = tf.compat.v1.placeholder(tf.int32, shape=[config.batch_size, config.num_steps], name='Input')
    labels_placeholder = tf.compat.v1.placeholder(tf.int32, shape=[config.batch_size, config.num_steps], name='Target')
    
    # embedding
    embedding = tf.compat.v1.get_variable('Embedding', initializer=embedding_matrix, trainable=False)
    inputs = tf.nn.embedding_lookup(params=embedding, ids=input_placeholder)
    inputs = [tf.squeeze(x, axis=1) for x in tf.split(inputs, config.num_steps, axis=1)]
    
    # LSTM
    initial_state = tf.zeros([config.batch_size, config.hidden_size])
    lstm_cell = tf.compat.v1.nn.rnn_cell.LSTMCell(config.hidden_size)
    output, _ = tf.compat.v1.nn.static_rnn(
            lstm_cell, inputs, dtype=tf.float32,
            sequence_length=[config.num_steps] * config.batch_size)
    
    # projection (dense): map each LSTM output to vocabulary logits
    proj_U = tf.compat.v1.get_variable('Matrix', [config.hidden_size, vocab_size])
    proj_b = tf.compat.v1.get_variable('Bias', [vocab_size])
    outputs = [tf.matmul(o, proj_U) + proj_b for o in output]
    
    # loss op: sequence_loss expects logits [batch, time, vocab] and targets/weights [batch, time]
    logits = tf.stack(outputs, axis=1)
    all_ones = tf.ones([config.batch_size, config.num_steps])
    cross_entropy = tfa.seq2seq.sequence_loss(logits, labels_placeholder, all_ones)
    tf.compat.v1.add_to_collection('total_loss', cross_entropy)
    loss = tf.add_n(tf.compat.v1.get_collection('total_loss'))
    
  • The only real change needed for the failing line is the namespace: TF 1.x's tf.contrib.rnn.static_rnn is available in TF 2.x as tf.compat.v1.nn.static_rnn, with the same arguments.
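
  • Because the converted code still uses placeholders, it has to be run in graph mode with a session. A usage sketch (batch_x and batch_y are hypothetical arrays of shape [batch_size, num_steps]):

    with tf.compat.v1.Session() as sess:
        sess.run(tf.compat.v1.global_variables_initializer())
        loss_val = sess.run(loss, feed_dict={input_placeholder: batch_x,
                                             labels_placeholder: batch_y})

  • As for the original TypeError: tf.keras.layers.RNN(...) constructs a layer object rather than running it, which is why it cannot be unpacked. If you did want the Keras route instead, a rough sketch (using the same inputs list built above) would be:

    # the RNN layer is built once, then called on a single [batch, time, features] tensor
    cell = tf.keras.layers.LSTMCell(config.hidden_size)
    rnn = tf.keras.layers.RNN(cell, return_sequences=True, return_state=True, unroll=True)
    
    x = tf.stack(inputs, axis=1)        # list of [batch, emb] -> [batch, time, emb]
    output, state_h, state_c = rnn(x)   # output: [batch, time, hidden_size]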