Tags: machine-learning, keras, deep-learning, tensorflow2.0, tensorflow-datasets

How do I train a Keras functional API model with batched tf Dataset objects? (BatchDataset)


I am constructing a tf.keras model using the functional API. The model trains fine on large memory-mapped arrays. However, for numerous reasons it can be advantageous to work with TensorFlow Dataset objects, so I use from_tensor_slices() to convert my arrays to Dataset objects. The problem is that the model then no longer trains.

The Keras docs (Model training APIs) indicate that Dataset objects are acceptable inputs.

The guide I'm following on how to train is found here: Using tf.data with tf keras

Guides on how to use the keras functional API are here. However, training a functional API model with a tf Dataset object is not outlined.

An MWE is provided here:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras import layers

print('numpy version: {}'.format(np.__version__))
print('keras version: {}'.format(keras.__version__))
print('tensorflow version: {}'.format(tf.__version__))

numpy version: 1.21.4
keras version: 2.6.0
tensorflow version: 2.6.0

X = np.random.uniform(size=(1000,75))
Y = np.random.uniform(size=(1000))

data = tf.data.Dataset.from_tensor_slices((X, Y))
print(data.cardinality().numpy())

1000

data.batch(batch_size=100, drop_remainder=True)

<BatchDataset shapes: ((100, 75), (100,)), types: (tf.float64, tf.float64)>

def API_Model(input_shape, name="test_model"):

    inputs = layers.Input(shape=input_shape)
    x = layers.Dense(1)(inputs)
    outputs = layers.Activation('relu')(x)

    return keras.Model(inputs=inputs, outputs=outputs, name=name)

api_model = API_Model(input_shape=(X.shape[1],))
api_model.compile()
api_model.summary()

Model: "test_model"
_________________________________________________________________  
Layer (type)                 Output Shape              Param #     
=================================================================  
input_2 (InputLayer)         [(None, 75)]              0           
_________________________________________________________________  
dense_1 (Dense)              (None, 1)                 76          
_________________________________________________________________  
activation_1 (Activation)    (None, 1)                 0           
=================================================================  
Total params: 76  
Trainable params: 76  
Non-trainable params: 0  
_________________________________________________________________

api_model.fit(data, epochs=10)

Epoch 1/10
WARNING:tensorflow:Model was constructed with shape (None, 75) for input KerasTensor(type_spec=TensorSpec(shape=(None, 75), dtype=tf.float32, name='input_2'), name='input_2', description="created by layer 'input_2'"), but it was called on an input with incompatible shape (75, 1).

The error I receive is: ValueError: Input 0 of layer dense_1 is incompatible with the layer: expected axis -1 of input shape to have value 75 but received input with shape (75, 1)

The error from the actual model I'm trying to train is slightly different, but it appears to stem from the same cause:

ValueError: Input 0 is incompatible with layer pfn_base: expected shape=(None, 1086, 5), found shape=(1086, 5)

What is the proper way to train a Keras functional API model on a BatchDataset object?


Solution

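  • Dataset.batch() does not modify the dataset in place; it returns a new dataset, so you need to assign the result back to a variable (data = data.batch(...)). Otherwise fit() receives the unbatched dataset and the model sees single samples with no batch dimension, which is what the shape errors report. You should also pass a loss function to model.compile, because the default is None and the model cannot learn anything without one. Here is a working example: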

    import numpy as np
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers
    
    print('numpy version: {}'.format(np.__version__))
    print('keras version: {}'.format(keras.__version__))
    print('tensorflow version: {}'.format(tf.__version__))
    X = np.random.uniform(size=(1000,75))
    Y = np.random.uniform(size=(1000))
    
    data = tf.data.Dataset.from_tensor_slices((X, Y))
    print(data.cardinality().numpy())
    # batch() returns a new dataset, so assign the result back to `data`
    data = data.batch(batch_size=100, drop_remainder=True)
    
    def API_Model(input_shape, name="test_model"):
    
        inputs = layers.Input(shape=input_shape)
        x = layers.Dense(1)(inputs)
        outputs = layers.Activation('relu')(x)
    
        return keras.Model(inputs=inputs, outputs=outputs, name=name)
    
    api_model = API_Model(input_shape=(X.shape[1],))
    # A loss is required for training; the default (loss=None) gives nothing to optimize
    api_model.compile(loss='mse')
    api_model.summary()
    api_model.fit(data, epochs=10)
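
A quick way to confirm whether a dataset is actually batched before calling fit() is to look at its element_spec. The snippet below is a minimal sketch using the same array shapes as the question; the variable name `batched` is only illustrative. It shows that the original dataset is left untouched by batch() and that only the returned dataset carries the leading batch dimension.

```python
import numpy as np
import tensorflow as tf

X = np.random.uniform(size=(1000, 75))
Y = np.random.uniform(size=(1000,))

data = tf.data.Dataset.from_tensor_slices((X, Y))

# batch() returns a new dataset; `data` itself is not modified.
batched = data.batch(batch_size=100, drop_remainder=True)

# Shape of the feature component of one element, before and after batching.
unbatched_x_shape = data.element_spec[0].shape.as_list()
batched_x_shape = batched.element_spec[0].shape.as_list()
n_batches = int(batched.cardinality().numpy())

print(unbatched_x_shape)  # [75]       -> single sample, no batch dimension
print(batched_x_shape)    # [100, 75]  -> what a (None, 75) input layer expects
print(n_batches)          # 10
```

If the feature shape printed for the dataset you pass to fit() has no leading batch dimension, the result of batch() was most likely discarded rather than assigned.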