In the implementation I am using, the LSTM is initialized in the following way:
l_lstm = Bidirectional(LSTM(64, return_sequences=True))(embedded_sequences)
What I don't really understand (and it might be because of my lack of experience with Python generally) is the notation l_lstm = Bidirectional(LSTM(...))(embedded_sequences).
I don't get what I am passing embedded_sequences
to. It is not a parameter of LSTM(),
but it also does not seem to be an argument of Bidirectional(),
since it stands separately.
Here is the source of Bidirectional's constructor:
def __init__(self, layer, merge_mode='concat', weights=None, **kwargs):
    if merge_mode not in ['sum', 'mul', 'ave', 'concat', None]:
        raise ValueError('Invalid merge mode. '
                         'Merge mode should be one of '
                         '{"sum", "mul", "ave", "concat", None}')
    self.forward_layer = copy.copy(layer)
    config = layer.get_config()
    config['go_backwards'] = not config['go_backwards']
    self.backward_layer = layer.__class__.from_config(config)
    self.forward_layer.name = 'forward_' + self.forward_layer.name
    self.backward_layer.name = 'backward_' + self.backward_layer.name
    self.merge_mode = merge_mode
    if weights:
        nw = len(weights)
        self.forward_layer.initial_weights = weights[:nw // 2]
        self.backward_layer.initial_weights = weights[nw // 2:]
    self.stateful = layer.stateful
    self.return_sequences = layer.return_sequences
    self.return_state = layer.return_state
    self.supports_masking = True
    self._trainable = True
    super(Bidirectional, self).__init__(layer, **kwargs)
    self.input_spec = layer.input_spec
    self._num_constants = None
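The interesting part of that constructor is the copy-and-flip step: it keeps the layer you passed in as the forward layer and rebuilds a second copy from its config with go_backwards inverted. Here is a minimal sketch of just that logic, using a hypothetical stand-in class (FakeRNN is not a Keras API, only enough of one to run the same three lines):

```python
import copy

class FakeRNN:
    """Stand-in for an RNN layer: just enough API to mimic the copy logic."""
    def __init__(self, units, go_backwards=False):
        self.units = units
        self.go_backwards = go_backwards

    def get_config(self):
        return {'units': self.units, 'go_backwards': self.go_backwards}

    @classmethod
    def from_config(cls, config):
        return cls(**config)

layer = FakeRNN(64)

# The same three steps as in Bidirectional.__init__:
forward_layer = copy.copy(layer)
config = layer.get_config()
config['go_backwards'] = not config['go_backwards']
backward_layer = layer.__class__.from_config(config)

print(forward_layer.go_backwards)   # False
print(backward_layer.go_backwards)  # True
```

So the wrapper ends up holding two independent layers with identical configuration except for the direction flag.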
Let's try to break down what is going on:
LSTM(...)
creates an LSTM layer. Now, layers in Keras are callable, which means you can use them like functions. For example, lstm = LSTM(...)
and then lstm(some_input)
will call the LSTM on the given input tensor.
Bidirectional(...)
wraps any RNN layer and returns you another layer that, when called, applies the wrapped layer in both directions. So l_lstm = Bidirectional(LSTM(...))
is a layer that, when called with some input, will apply the LSTM
in both directions. Note: Bidirectional creates a copy of the passed LSTM layer, so the backward and forward layers are different LSTMs.
Bidirectional(LSTM(...))(embedded_sequences)
then calls that bidirectional layer on the input sequences: it passes them to the wrapped LSTMs in both directions, collects their outputs and concatenates them.
To understand more about layers and their callable nature, you can look at the functional API guide in the documentation.
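The construct-then-call pattern can be sketched without Keras at all. Scale and Twice below are hypothetical toy classes (not Keras APIs): Scale plays the role of LSTM, Twice plays the role of Bidirectional, and __call__ is what makes an object usable like a function:

```python
class Scale:
    """Toy 'layer': multiply each element of a sequence by a factor."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, xs):
        return [x * self.factor for x in xs]

class Twice:
    """Toy 'wrapper layer': applies the wrapped layer to the sequence
    forwards and backwards, then concatenates the two results --
    loosely like Bidirectional with merge_mode='concat'."""
    def __init__(self, layer):
        self.layer = layer

    def __call__(self, xs):
        return self.layer(xs) + self.layer(list(reversed(xs)))

# Same shape as Bidirectional(LSTM(...))(embedded_sequences):
# build the inner layer, wrap it, then call the wrapper on the input.
out = Twice(Scale(2))([1, 2, 3])
print(out)  # [2, 4, 6, 6, 4, 2]
```

The key point is that Twice(Scale(2)) is itself just an object; the trailing ([1, 2, 3]) is a second, separate call that feeds it the input.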