I have a list of 3 Keras models that each have an output shape of (None, 2). I also have a common Keras base model that produces their input. My goal is to combine the 4 models, taking only the first output from each model in the list, so the final output should have shape (None, 3). My problem occurs when I try to use a Lambda layer to extract the first output from each model.
If I omit the Lambda step and simply combine the models as follows, it creates a model that gives the correct output with shape (None, 6):
>>> sequentials = [Sequential([base_model, m]) for m in models]
>>> output = merge([s.output for s in sequentials], mode='concat')
>>> combined = Model(input=base_model.layers[0].input, output=output)
>>> combined.predict(X)
array([[  8.52127552e-01,   1.47872433e-01,   1.89960217e-13,
          1.00000000e+00,   7.56258190e-01,   2.43741751e-01]], dtype=float32)
The problem occurs when I first use a Lambda layer to extract the first value from each model:
>>> print([m.output_shape for m in models])
[(None, 2), (None, 2), (None, 2)]
>>> for m in models:
...     m.add(Lambda(lambda x: x[0], output_shape=(1,)))
>>> print([m.output_shape for m in models])
[(None, 1), (None, 1), (None, 1)]
>>> sequentials = [Sequential([base_model, m]) for m in models]
>>> print([s.output_shape for s in sequentials])
[(None, 1), (None, 1), (None, 1)]
>>> output = merge([s.output for s in sequentials],
...                output_shape=(len(sequentials),), mode='concat')
>>> combined = Model(input=base_model.layers[0].input, output=output)
>>> print(combined.output_shape)
(None, 3)
>>> combined.predict(X)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-4f4ed3bd605d> in <module>()
----> 1 combined.predict(X)
./.virtualenvs/py3/lib/python3.4/site-packages/keras/engine/training.py in predict(self, x, batch_size, verbose)
1217 f = self.predict_function
1218 return self._predict_loop(f, ins,
-> 1219 batch_size=batch_size, verbose=verbose)
1220
1221 def train_on_batch(self, x, y,
./.virtualenvs/py3/lib/python3.4/site-packages/keras/engine/training.py in _predict_loop(self, f, ins, batch_size, verbose)
904
905 for i, batch_out in enumerate(batch_outs):
--> 906 outs[i][batch_start:batch_end] = batch_out
907 if verbose == 1:
908 progbar.update(batch_end)
ValueError: could not broadcast input array from shape (6) into shape (1)
What is the right way to merge these models while only taking the single output value from each one?
Note that I can successfully use a Lambda function if I apply it after merging the models as follows:
>>> sequentials = [Sequential([base_model, m]) for m in models]
>>> output = merge([s.output for s in sequentials], mode='concat')
>>> filtered = Lambda(lambda x: x[:, ::2], lambda s: (s[-1] // 2,))(output)
>>> combined = Model(input=base_model.layers[0].input, output=filtered)
>>> combined.predict(X)
array([[  1.89960217e-13,   7.56258249e-01,   8.52127552e-01]], dtype=float32)
But I would like to know how to apply it before the merge.
The problem lies in a subtle inconsistency in Lambda slicing. Although the output_shape you declare doesn't include the batch dimension, the tensors passed to a Lambda layer do have this additional leading dimension. This is why the following line causes an error:
m.add(Lambda(lambda x: x[0], output_shape=(1,)))
This should be changed to:
m.add(Lambda(lambda x: x[:,:1], output_shape=(1,)))
Beware of the following way of slicing:
m.add(Lambda(lambda x: x[:,0], output_shape=(1,)))
as it changes the dimensionality of the tensor: x[:,0] drops the last axis, giving an output of shape (None,) instead of (None, 1).
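To make the difference concrete, here is a quick NumPy sketch (basic slicing behaves the same way on the backend tensors) using a single-sample batch of shape (1, 2):
>>> import numpy as np
>>> x = np.array([[0.8, 0.2]])  # a batch of one sample, shape (1, 2)
>>> x[0].shape      # indexes into the batch axis: returns one sample
(2,)
>>> x[:, 0].shape   # keeps the batch axis but drops the feature axis
(1,)
>>> x[:, :1].shape  # keeps both axes -- what the Lambda layer needs
(1, 1)
Only the last form preserves a per-sample output of shape (None, 1), which is what the downstream concat expects.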
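Putting it together, here is a minimal end-to-end sketch of the corrected pipeline (the layer sizes and input dimension are made up for illustration; the rest mirrors the code from the question, using the Keras 1.x API):
from keras.models import Sequential, Model
from keras.layers import Dense, Lambda, merge

# Hypothetical architecture: a shared base feeding three 2-way heads.
base_model = Sequential([Dense(8, input_dim=4, activation='relu')])
models = [Sequential([Dense(2, input_dim=8, activation='softmax')])
          for _ in range(3)]

for m in models:
    # Batch-aware slice: keep only the first column of each head's output.
    m.add(Lambda(lambda x: x[:, :1], output_shape=(1,)))

sequentials = [Sequential([base_model, m]) for m in models]
output = merge([s.output for s in sequentials], mode='concat')
combined = Model(input=base_model.layers[0].input, output=output)
print(combined.output_shape)  # (None, 3)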