Tags: tensorflow, keras, lstm, recurrent-neural-network

Keras TimeDistributed for multi-input case?


Description of our model

[Figure: diagram of the hierarchical model (low_level_model feeding an upper LSTM)]

In our model, I would like to wrap low_level_model with TimeDistributed so that its outputs feed an upper LSTM layer, forming a hierarchical model. low_level_model computes a hidden representation of each customer visit by aggregating the visit's area sequence and its visit_id: the sequence of areas goes through a CNN and an attention layer, and the result is concatenated with the embedded vector of the visit.

As far as I know, the TimeDistributed wrapper can be used to build such a hierarchical model, so I tried to wrap our low_level_model, which takes two different inputs. But it seems the library does not support the multi-input case. Here is our code.

# Get 1st input
visit_input = keras.Input((1,))
visit_emb = visit_embedding_layer(visit_input)
visit_output = Reshape((-1,))(visit_emb)

# Get 2nd input - Shallow model
areas_input = keras.Input((10,))
areas_emb = area_embedding_layer(areas_input)
areas_cnn = Conv1D(filters=200, kernel_size=5,
                   padding='same', activation='relu', strides=1)(areas_emb)
areas_output = simple_attention(areas_cnn, areas_cnn)

# Concat two results from 1st and 2nd input
v_a_emb_concat = Concatenate()([visit_output, areas_output])

# Define this model as low_level_model
low_level_model = keras.Model(inputs=[areas_input, visit_input], outputs=v_a_emb_concat)

# Would like to use the result of this low_level_model as inputs for higher-level LSTM layer.
# Therefore, wrap this model by TimeDistributed layer
encoder = TimeDistributed(low_level_model)

# New inputs with 5 time steps (5 = the number of previous visits)
all_visit_input = keras.Input((5, 1))
all_areas_input = keras.Input((5, 10))

# This part raises AssertionError (assert len(input_shape) >= 3)
# (inputs listed in the same order as low_level_model's inputs)
all_areas_rslt = encoder(inputs=[all_areas_input, all_visit_input])
all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
logits = Dense(365, activation='softmax')(all_areas_lstm)

# Model definition (multi-input ISSUE HERE!)
self.model = keras.Model(inputs=[all_visit_input, all_areas_input], outputs=logits)

self.model.compile(optimizer=keras.optimizers.Adam(0.001),
                   loss=custom_loss_function)

# Get data
self.train_data = data.train_data_generator_hist()
self.test_data = data.test_data_generator_hist()

# Fit
self.history = self.model.fit_generator(
    generator=self.train_data,
    steps_per_epoch=train_data_size // FLAGS.batch_size,
    epochs=FLAGS.train_epochs
)

Error Message

The error message is as follows.

File "/home/dmlab/sundong/revisit/survival-revisit-code/survrev.py", line 163, in train_test
all_areas_rslt = encoder(inputs=[all_visit_input, all_areas_input])
File "/home/dmlab/ksedm1/anaconda3/envs/py36/lib/python3.6/site-packages/keras/engine/base_layer.py", line 431, in __call__
self.build(unpack_singleton(input_shapes))
File "/home/dmlab/ksedm1/anaconda3/envs/py36/lib/python3.6/site-packages/keras/layers/wrappers.py", line 195, in build
assert len(input_shape) >= 3
AssertionError
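
The assertion comes from TimeDistributed.build in keras/layers/wrappers.py: the wrapper expects a single shape tuple with at least three dimensions, i.e. (batch, timesteps, features). When the wrapped model has two inputs, Keras hands build a list of two shape tuples instead, so the length check sees 2 and fails. A small illustration of the check (the shapes here are just examples):

single_shape = (None, 5, 10)                  # one (batch, time, features) tuple
multi_shape = [(None, 5, 1), (None, 5, 10)]   # list of shapes for a two-input model

print(len(single_shape) >= 3)  # True  -> build() proceeds
print(len(multi_shape) >= 3)   # False -> `assert len(input_shape) >= 3` raises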

What I've tried

1) I read this Keras issue but could not figure out the trick for forwarding several inputs.

2) I checked that the code works with TimeDistributed when I use only a single input (e.g., areas_input). The revised code sample is shown below.

3) I am now trying to follow this previous question (Keras TimeDistributed layer with multiple inputs).

# Using only one input 
areas_input = keras.Input((10,))
areas_emb = area_embedding_layer(areas_input)
areas_cnn = Conv1D(filters=200, kernel_size=5,
                   padding='same', activation='relu', strides=1)(areas_emb)
areas_output = simple_attention(areas_cnn, areas_cnn)

# Define this model as low_level_model
low_level_model = keras.Model(inputs=areas_input, outputs=areas_output)

# Would like to use the result of this low_level_model as inputs for higher-level LSTM layer.
# Therefore, wrap this model by TimeDistributed layer
encoder = TimeDistributed(low_level_model)

# New input with 5 time steps (5 = the number of previous visits)
all_areas_input = keras.Input((5, 10))

# No Error
all_areas_rslt = encoder(inputs=all_areas_input)
all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
logits = Dense(365, activation='softmax')(all_areas_lstm)

# Model definition (single input - no issue)
self.model = keras.Model(inputs=all_areas_input, outputs=logits)

self.model.compile(optimizer=keras.optimizers.Adam(0.001),
                   loss=custom_loss_function)

# Get data
self.train_data = data.train_data_generator_hist()
self.test_data = data.test_data_generator_hist()

# Fit
self.history = self.model.fit_generator(
    generator=self.train_data,
    steps_per_epoch=train_data_size // FLAGS.batch_size,
    epochs=FLAGS.train_epochs
)

Thanks in advance for sharing your techniques to work around this issue.


Solution

  • In conclusion, I solved this issue by feeding all the inputs together as one tensor and splitting that tensor apart inside the model with Lambda layers, because TimeDistributed can only accept a single input. Here are my code snippets.

    single_input = keras.Input((1 + 10,))                    # packed row: visit_id (1) + area sequence (10)
    visit_input = Lambda(lambda x: x[:, 0:1])(single_input)  # first column: visit_id
    areas_input = Lambda(lambda x: x[:, 1:])(single_input)   # remaining 10 columns: areas
    ...
    low_level_model = keras.Model(inputs=single_input, outputs=concat)
    
    encoder = TimeDistributed(low_level_model)
    multiple_inputs = keras.Input((5, 11))  # 5 time steps, each a packed row of width 11
    all_areas_rslt = encoder(inputs=multiple_inputs)
    all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
    logits = Dense(365, activation='softmax')(all_areas_lstm)
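
For completeness, below is a minimal, self-contained sketch of the same single-tensor workaround that runs end to end. The vocabulary sizes, the embedding dimension, the GlobalMaxPooling1D standing in for simple_attention, and the categorical_crossentropy standing in for custom_loss_function are all illustrative assumptions, not values from the original model:

    import numpy as np
    import keras
    from keras.layers import (Concatenate, Conv1D, Dense, Embedding,
                              GlobalMaxPooling1D, LSTM, Lambda, Reshape,
                              TimeDistributed)

    NUM_VISITS, NUM_AREAS = 1000, 500  # assumed vocabulary sizes
    EMB_DIM, STEPS, SEQ_LEN = 50, 5, 10

    # low_level_model: one packed input of width 1 + 10 = 11
    single_input = keras.Input((1 + SEQ_LEN,))
    visit_part = Lambda(lambda x: x[:, 0:1])(single_input)  # visit_id column
    areas_part = Lambda(lambda x: x[:, 1:])(single_input)   # 10 area columns

    visit_emb = Embedding(NUM_VISITS, EMB_DIM)(visit_part)
    visit_vec = Reshape((-1,))(visit_emb)

    areas_emb = Embedding(NUM_AREAS, EMB_DIM)(areas_part)
    areas_cnn = Conv1D(filters=200, kernel_size=5, padding='same',
                       activation='relu', strides=1)(areas_emb)
    areas_vec = GlobalMaxPooling1D()(areas_cnn)  # stand-in for simple_attention

    concat = Concatenate()([visit_vec, areas_vec])
    low_level_model = keras.Model(inputs=single_input, outputs=concat)

    # Upper level: 5 time steps, each one packed row of width 11
    encoder = TimeDistributed(low_level_model)
    multiple_inputs = keras.Input((STEPS, 1 + SEQ_LEN))
    all_areas_rslt = encoder(multiple_inputs)
    all_areas_lstm = LSTM(64, return_sequences=False)(all_areas_rslt)
    logits = Dense(365, activation='softmax')(all_areas_lstm)

    model = keras.Model(inputs=multiple_inputs, outputs=logits)
    model.compile(optimizer=keras.optimizers.Adam(0.001),
                  loss='categorical_crossentropy')  # stand-in loss

    # Smoke test with one random batch of packed ids, shape (batch, 5, 11)
    x = np.random.randint(0, NUM_AREAS, size=(32, STEPS, 1 + SEQ_LEN))
    print(model.predict(x).shape)  # (32, 365)

Note that the data generators must be adapted to match: instead of yielding two arrays of shapes (batch, 5, 1) and (batch, 5, 10), they should yield a single packed array of shape (batch, 5, 11), e.g. via np.concatenate along the last axis.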