python, machine-learning, neural-network, keras, keras-layer

Unable to merge keras models after adding Lambda layer


I have a list of 3 Keras models that each have an output shape of (None, 2). I also have a common Keras base model that produces their input. My goal is to combine the 4 models, but take only the first output from each of the models in the list (so the final output should have shape (None, 3)). My problem occurs when I try to use a Lambda layer to extract the first output from each model.

If I omit the Lambda step and simply combine the models as follows, it creates a model that gives the correct output with shape (None, 6):

>>> sequentials = [Sequential([base_model, m]) for m in models]
>>> output = merge([s.output for s in sequentials], mode='concat')
>>> combined = Model(input=base_model.layers[0].input, output=output)
>>> combined.predict(X)
array([[  8.52127552e-01,   1.47872433e-01,   1.89960217e-13,
          1.00000000e+00,   7.56258190e-01,   2.43741751e-01]], dtype=float32)

The problem occurs when I first use a Lambda layer to extract the first value from each model:

>>> print([m.output_shape for m in models])
[(None, 2), (None, 2), (None, 2)]
>>> for m in models:
        m.add(Lambda(lambda x: x[0], output_shape=(1,)))
>>> print([m.output_shape for m in models])
[(None, 1), (None, 1), (None, 1)]
>>> sequentials = [Sequential([base_model, m]) for m in models]
>>> print([s.output_shape for s in sequentials])
[(None, 1), (None, 1), (None, 1)]
>>> output = merge([s.output for s in sequentials],
                   output_shape=(len(sequentials),), mode='concat')
>>> combined = Model(base_model.layers[0].input, output=output)
>>> print(combined.output_shape)
(None, 3)
>>> combined.predict(X)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-4f4ed3bd605d> in <module>()
----> 1 ann.combined.predict(X)

./.virtualenvs/py3/lib/python3.4/site-packages/keras/engine/training.py in predict(self, x, batch_size, verbose)
   1217         f = self.predict_function
   1218         return self._predict_loop(f, ins,
-> 1219                                   batch_size=batch_size, verbose=verbose)
   1220
   1221     def train_on_batch(self, x, y,

./.virtualenvs/py3/lib/python3.4/site-packages/keras/engine/training.py in _predict_loop(self, f, ins, batch_size, verbose)
    904
    905             for i, batch_out in enumerate(batch_outs):
--> 906                 outs[i][batch_start:batch_end] = batch_out
    907             if verbose == 1:
    908                 progbar.update(batch_end)

ValueError: could not broadcast input array from shape (6) into shape (1)

What is the right way to merge these models while only taking the single output value from each one?

Note that I can successfully use a Lambda layer if I apply it after merging the models, as follows:

>>> sequentials = [Sequential([base_model, m]) for m in models]
>>> output = merge([s.output for s in sequentials], mode='concat')
>>> filtered = Lambda(lambda x: x[:,::2], lambda s: (s[-1] / 2,))(output)
>>> combined = Model(input=base_model.layers[0].input, output=filtered)
>>> combined.predict(X)
array([[  1.89960217e-13,   7.56258249e-01,   8.52127552e-01]], dtype=float32)

But I would like to know how to apply it before the merge.


Solution

  • The problem lies in a subtle inconsistency in the Lambda slicing. Although output_shape does not take the batch dimension into account, the tensors passed to a Lambda layer do carry this additional dimension. x[0] therefore selects the entire first sample of the batch (shape (2,)) rather than the first output of every sample, which is where the 6 stray values in the ValueError come from. This is why the following line causes an error:

    m.add(Lambda(lambda x: x[0], output_shape=(1,)))
    

    This should be changed to:

    m.add(Lambda(lambda x: x[:,:1], output_shape=(1,))) 
    

    Beware of the following way of slicing:

    m.add(Lambda(lambda x: x[:,0], output_shape=(1,)))

    as it changes the dimensionality of the tensor: x[:,0] drops the sliced axis and produces shape (None,) instead of (None, 1). The sketches below illustrate the difference.
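
    To see the three slicings side by side, here is a minimal NumPy sketch (NumPy arrays stand in for the backend tensors here; the slicing semantics are the same):

    import numpy as np

    x = np.zeros((4, 2))         # stands in for a (batch_size, 2) output tensor
    print(x[0].shape)            # (2,)   - the whole first sample, not the first output
    print(x[:, :1].shape)        # (4, 1) - first output of every sample, keeps the 2-D shape
    print(x[:, 0].shape)         # (4,)   - first output of every sample, but one axis fewer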
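
    Putting it all together, a sketch of the corrected pipeline, assuming the same Keras 1.x session as in the question (base_model, models and X are the objects defined there, with models as originally defined, i.e. before the faulty Lambda was added):

    from keras.layers import Lambda, merge
    from keras.models import Model, Sequential

    # Slice along the feature axis, keeping the batch dimension intact.
    for m in models:
        m.add(Lambda(lambda x: x[:, :1], output_shape=(1,)))

    sequentials = [Sequential([base_model, m]) for m in models]
    output = merge([s.output for s in sequentials],
                   output_shape=(len(sequentials),), mode='concat')
    combined = Model(input=base_model.layers[0].input, output=output)
    combined.predict(X)  # now yields an array of shape (n_samples, 3)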