
Tensorflow Keras Dot of Layers Shape Error


I am trying to create a regression model that outputs the product of a scalar Dense layer and a scalar custom layer. The input of the model is a 2D array.

The CustomLayer should only operate on the first element of the input array, so my custom layer currently looks like this:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class CustomLayer(tf.keras.layers.Layer):
    
    def __init__(self, *args, **kwargs):
        super().__init__( *args, **kwargs)
        self.a = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.b = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.rho = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.m = tf.Variable(0, dtype=tf.float32, trainable=True)
        self.sigma = tf.Variable(0.1, dtype=tf.float32, trainable=True)

    def call(self, inputs):
        return self.a + self.b * (self.rho * (inputs[0] - self.m) + tf.math.sqrt( tf.math.square( (inputs[0] - self.m) ) + tf.math.square( self.sigma)))

and the functional API looks like this:

inputs = keras.Input(shape=(2, ), name="digits")
x1 = layers.Dense(40, activation="softplus", kernel_initializer='normal')(inputs)
x2 = layers.Dense(1,  activation="softplus")(x1)
custom = CustomLayer()(inputs)
outputs = layers.dot([custom, x2], 0)  # Should I use axes=0?
model = keras.Model(inputs=inputs, outputs=outputs)

But the code gives an error no matter what axis I use.

Dot.build(self, input_shape)
    134     axes = self.axes
    135     if shape1[axes[0]] != shape2[axes[1]]:
--> 136         raise ValueError(
    137             "Incompatible input shapes: "
    138             f"axis values {shape1[axes[0]]} (at axis {axes[0]}) != "
    139             f"{shape2[axes[1]]} (at axis {axes[1]}). "
    140             f"Full input shapes: {shape1}, {shape2}"
    141         )

ValueError: Incompatible input shapes: axis values 2 (at axis 0) != None (at axis 0). Full input shapes: (2,), (None, 1)

When I inspect the shapes of x2 and the custom layer, I see this:

x2: <KerasTensor: shape=(None, 1) dtype=float32 (created by layer 'dense_191')> (I understand the None comes from the batch size.)
custom: <KerasTensor: shape=(2,) dtype=float32 (created by layer 'custom_layer_3')> (Why don't I have the None here?)
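These reprs can be reproduced by simply printing the symbolic tensors after wiring the layers (the exact layer names vary between runs), e.g.:

print(x2)      # <KerasTensor: shape=(None, 1) dtype=float32 (created by layer 'dense_191')>
print(custom)  # <KerasTensor: shape=(2,) dtype=float32 (created by layer 'custom_layer_3')>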

Apologies if I am asking something very trivial. Thank you.


Solution

  • There are a couple of issues. First, your custom layer has output shape shape=(2,) instead of shape=(None, 2). You can fix that by changing inputs[0] to inputs in

    return self.a + self.b * (self.rho * (inputs[0] - self.m) + tf.math.sqrt( tf.math.square( (inputs[0] - self.m) ) + tf.math.square( self.sigma)))
    

    Otherwise, you discard the batch dimension and only take the first element of the batch.
    Second, you now have shapes (None, 2) (custom layer) and (None, 1) (x2). For a dot product, both vectors must have the same length, so either change the output of x2 to (None, 2) or change the output of your CustomLayer to (None, 1).

    Edit: If you only want the first scalar element in CustomLayer, change inputs[0] to inputs[:, :1] in the call function to get shape=(None, 1); see the sketch below.
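
Putting both fixes together, a minimal sketch of the corrected model could look like the following. The imports, the axes=1 choice (a dot over the feature axis, which for per-sample scalars reduces to the intended element-wise multiplication), and the model.summary() call are my additions; adjust them to your setup.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class CustomLayer(tf.keras.layers.Layer):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.a = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.b = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.rho = tf.Variable(0.1, dtype=tf.float32, trainable=True)
        self.m = tf.Variable(0.0, dtype=tf.float32, trainable=True)
        self.sigma = tf.Variable(0.1, dtype=tf.float32, trainable=True)

    def call(self, inputs):
        # Slice the first feature but keep the batch dimension: (None, 2) -> (None, 1)
        x = inputs[:, :1]
        return self.a + self.b * (self.rho * (x - self.m)
                                  + tf.math.sqrt(tf.math.square(x - self.m)
                                                 + tf.math.square(self.sigma)))

inputs = keras.Input(shape=(2,), name="digits")
x1 = layers.Dense(40, activation="softplus", kernel_initializer="normal")(inputs)
x2 = layers.Dense(1, activation="softplus")(x1)   # shape (None, 1)
custom = CustomLayer()(inputs)                    # shape (None, 1) after the slice
outputs = layers.dot([custom, x2], axes=1)        # per-sample scalar product -> (None, 1)
model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()

Since both branches now have shape (None, 1), the Dot layer's shape check passes and the output is one scalar per sample.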