Search code examples
pythonkerasconcatenationreshape

Concatenate identity matrix to each vector


I want to modify my input by adding several different suffixes to the input vectors. For example, if the (single) input is [1, 5, 9, 3] I want to create three vectors (stored as matrix) like this:

[[1, 5, 9, 3, 1, 0, 0],
 [1, 5, 9, 3, 0, 1, 0],
 [1, 5, 9, 3, 0, 0, 1]]

Of course, this is just one observation so the input to the model is (None, 4) in this case. The simple way is to prepare the input data somewhere else (numpy most probably) and adjust the shape of input accordingly. That I can do but I would prefer doing it inside TensorFlow/Keras.

I have isolated the problem into this code:

import keras.backend as K
from keras import Input, Model
from keras.layers import Lambda


def build_model(dim_input: int, dim_eye: int):
    input = Input((dim_input,))
    concat = Lambda(lambda x: concat_eye(x, dim_input, dim_eye))(input)
    return Model(inputs=[input], outputs=[concat])


def concat_eye(x, dim_input, dim_eye):
    x = K.reshape(x, (-1, 1, dim_input))
    x = K.repeat_elements(x, dim_eye, axis=1)
    eye = K.expand_dims(K.eye(dim_eye), axis=0)
    eye = K.tile(eye, (-1, 1, 1))
    out = K.concatenate([x, eye], axis=2)
    return out


def main():
    import numpy as np

    n = 100
    dim_input = 20
    dim_eye = 3

    model = build_model(dim_input, dim_eye)
    model.compile(optimizer='sgd', loss='mean_squared_error')

    x_train = np.zeros((n, dim_input))
    y_train = np.zeros((n, dim_eye, dim_eye + dim_input))
    model.fit(x_train, y_train)


if __name__ == '__main__':
    main()

The problem seems to be in the -1 in shape argument in tile function. I tried to replace it with 1 and None. Each has its own error:

  • -1: error during model.fit

    tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected multiples[0] >= 0, but got -1
    
  • 1: error duting model.fit

    tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [32,3,20] vs. shape[1] = [1,3,3]
    
  • None: error during build_model:

    Failed to convert object of type <class 'tuple'> to Tensor. Contents: (None, 1, 1). Consider casting elements to a supported type.
    

Solution

  • You need to use K.shape() instead to get the symbolic shape of input tensor. That's because the batch size is None and therefore passing K.int_shape(x)[0] or None or -1 as a part of the second argument of K.tile() would not work:

    eye = K.tile(eye, (K.shape(x)[0], 1, 1))