
Format of several x inputs for training multi input functional keras model


I am currently trying to understand what input format a multi-input Keras model expects, and I don't understand how to feed in several samples.

import numpy as np
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

first_input = Input(shape=(2,))
second_input = Input(shape=(2,))
concat_layer = Concatenate()([first_input, second_input])
hidden= Dense(2, activation="relu")(concat_layer)
output = Dense(1, activation="sigmoid")(hidden)
model = Model(inputs=[first_input, second_input], outputs=output)
model.summary()
model.compile(loss='mean_squared_error', metrics=['mean_squared_error'], optimizer='adam')

# I managed to get the format for prediction and single training data correct
# this works
inp = [np.array([[0,2]]), np.array([[0,2]])]
model.predict(inp)
model.fit(inp,np.array([42]), epochs=3, )

# I don't get why this isn't working
# this doesn't work
model.fit(np.array([inp,inp]),np.array([42, 43]), epochs=3, )

Having read the Keras documentation for the fit function, I really don't understand why my version isn't working:

x : Vector, matrix, or array of training data (or list if the model has multiple inputs). If all inputs in the model are named, you can also pass a list mapping input names to data. x can be NULL (default) if feeding from framework-native tensors (e.g. TensorFlow data tensors).

Because I am literally giving it an array of lists.

The last line of code results in the following error:

ValueError: Layer model expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(None, 2, 1, 2) dtype=int64>]

Any help appreciated.


Solution

  • When you create the model

    model = Model(inputs=[first_input, second_input], outputs=output)
    

    This means that the model expects a list of 2 input arrays, each with per-sample shape (2,), i.e. (batch_size, 2), and produces one output with per-sample shape (1,) (as defined by the last Dense layer).

    So, when you use as arguments:

    inp = [np.array([[0,2]]), np.array([[0,2]])]
    

    This is a list with 2 arrays of shape (1, 2), i.e. one sample for each of the two inputs.

    And

    np.array([42])
    

    which is an array with shape (1,).

    This matches your model definition. [Strictly speaking, the model output has shape (1, 1), so a target of shape (1, 1) would be the exact match.]

    The line

    model.fit(np.array([inp,inp]),np.array([42, 43]))
    

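    is trying to feed a single array of shape (2, 2, 1, 2), because np.array merges the list of lists of arrays into one array, together with a target of shape (2,). Keras therefore sees one input instead of the list of two that the model definition expects, which is what the error message reports.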
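One way to turn several single-sample inputs into a batch is to concatenate them per model input along the batch axis, so the list still holds exactly one array per input. The sketch below uses NumPy only; the names `sample_a`, `sample_b`, `x_batch` and `y_batch` are illustrative and do not come from the question.

```python
import numpy as np

# Two single samples in the format that already works for the model:
# a list with one array of shape (1, 2) per model input.
sample_a = [np.array([[0, 2]]), np.array([[0, 2]])]
sample_b = [np.array([[1, 3]]), np.array([[4, 5]])]

# Wrapping the lists in np.array collapses everything into ONE 4-D array.
wrong = np.array([sample_a, sample_b])
print(wrong.shape)  # (2, 2, 1, 2)

# Instead, concatenate per model input along axis 0 (the batch axis).
x_batch = [np.concatenate(parts, axis=0) for parts in zip(sample_a, sample_b)]
y_batch = np.array([42, 43])

print([x.shape for x in x_batch])  # [(2, 2), (2, 2)]
print(y_batch.shape)               # (2,)
```

With the data in this form, `model.fit(x_batch, y_batch, epochs=3)` should receive two input tensors, one per `Input` layer.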

    Since it is easy to get this wrong, I like to build the arguments to the .fit or .predict call first and then explicitly print their shapes.

    e.g.

    x_train = [a, b] # where a and b are np.arrays
    print([x.shape for x in x_train])
    

    For instance, try:

    x_first = np.random.rand(8, 2)   # batch_size 8, feature_size 2
    x_second = np.random.rand(8, 2)
    x_train = [x_first, x_second]    # one array per model input
    
    y_true = np.random.rand(8)       # one target per sample
    
    model.fit(x_train, y_true)
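The habit of printing shapes before calling fit can also be wrapped in a small check. The following is a minimal sketch, assuming the data is held in NumPy arrays; `check_fit_args` is a hypothetical helper written for this answer, not part of Keras.

```python
import numpy as np

def check_fit_args(x_list, y, n_inputs):
    """Hypothetical helper: sanity-check multi-input data before model.fit."""
    # A multi-input model needs a list/tuple, not one merged array.
    if not isinstance(x_list, (list, tuple)):
        raise TypeError("x must be a list/tuple with one array per model input")
    if len(x_list) != n_inputs:
        raise ValueError(f"expected {n_inputs} input arrays, got {len(x_list)}")
    # Every input and the targets must share the same batch dimension.
    batch_sizes = {np.asarray(x).shape[0] for x in x_list}
    batch_sizes.add(np.asarray(y).shape[0])
    if len(batch_sizes) != 1:
        raise ValueError(f"inconsistent batch sizes: {sorted(batch_sizes)}")
    return [np.asarray(x).shape for x in x_list]

x_first = np.random.rand(8, 2)
x_second = np.random.rand(8, 2)
y_true = np.random.rand(8)

shapes = check_fit_args([x_first, x_second], y_true, n_inputs=2)
print(shapes)  # [(8, 2), (8, 2)]
```

Passing the merged array from the question, `np.array([inp, inp])`, to this helper raises a `TypeError` before Keras is ever called, which makes the mistake easier to spot.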