Tags: python, numpy, tensorflow, keras, conv-neural-network

Tensorflow Keras TypeError: Eager execution of tf.constant with unsupported shape


I want to apply a Conv2D layer with a custom kernel to an input array with a custom shape, using this code:

import tensorflow as tf
import numpy as np
import sys
tf.compat.v1.enable_eager_execution()

# kernel
kernel_array = np.array([[1, 1, 1], [1, 1, 1], [1, 1, 1]])
kernel = tf.keras.initializers.Constant(kernel_array)
print('kernel shape:', kernel_array.shape)
print('kernel:',kernel_array)

# input
input_shape = (3, 3, 4, 1)
x = tf.random.normal(input_shape)
print('x shape:', x.shape)
print('x:',x.numpy())

# output
y = tf.keras.layers.Conv2D(
    3, kernel_array.shape, padding="same", strides=(1, 1),
    kernel_initializer=kernel,
    input_shape=input_shape[1:])(x)
print('y shape:', y.shape)
print('y:',y.numpy())

The above code gives me this error:

kernel shape: (3, 3)
kernel: [[1 1 1]
 [1 1 1]
 [1 1 1]]
x shape: (3, 3, 4, 1)
x: [[[[-0.01953345]
   [-0.7110965 ]
   [ 0.15634525]
   [ 0.1707633 ]]

  [[-0.70654714]
   [ 2.7564087 ]
   [ 0.60063267]
   [ 2.8321717 ]]

  [[ 1.4761941 ]
   [ 0.34693545]
   [ 0.85550934]
   [ 2.2514896 ]]]


 [[[ 0.82585895]
   [-0.6421492 ]
   [ 1.2688193 ]
   [-0.9054445 ]]

  [[ 1.1591591 ]
   [ 0.7465941 ]
   [ 1.2531661 ]
   [ 2.2717664 ]]

  [[-0.48740315]
   [-0.42796597]
   [ 0.4480274 ]
   [-1.1502023 ]]]


 [[[-0.7792355 ]
   [-0.801604  ]
   [ 1.6724508 ]
   [ 0.25857568]]

  [[ 0.09068593]
   [-0.4783198 ]
   [-0.02883703]
   [-2.1400564 ]]

  [[-0.5518157 ]
   [-1.4513488 ]
   [-0.07611077]
   [ 1.4752681 ]]]]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[6], line 19
     16 print('x:',x.numpy())
     18 # output
---> 19 y = tf.keras.layers.Conv2D(
     20     3, kernel_array.shape, padding="same", strides=(1, 1),
     21     kernel_initializer=kernel,
     22     input_shape=input_shape[1:])(x)
     23 print('y shape:', y.shape)
     24 print('y:',y.numpy())

File c:\Users\xxxx\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File c:\Users\xxxx\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\initializers\initializers.py:265, in Constant.__call__(self, shape, dtype, **kwargs)
    261 if layout:
    262     return utils.call_with_layout(
    263         tf.constant, layout, self.value, shape=shape, dtype=dtype
    264     )
--> 265 return tf.constant(self.value, dtype=_get_dtype(dtype), shape=shape)

TypeError: Eager execution of tf.constant with unsupported shape. Tensor [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]] (converted from [[1 1 1]
 [1 1 1]
 [1 1 1]]) has 9 elements, but got `shape` (3, 3, 1, 3) with 27 elements).

I have no idea where the mistake is. I tried changing the input shape, but it still doesn't work. What did I miss?


Solution

  • From this part of your code:

    y = tf.keras.layers.Conv2D(
        3, kernel_array.shape,
    ...
    

    We see that you're trying to do a convolution with a kernel of shape (3, 3) and 3 filters. This means your output will end up with 3 channels. However, x.shape shows that the input has 1 channel. So the kernel has to convolve across the image in 3x3 windows while going from 1 channel to 3 channels. Putting this together gives the shape (3, 3, 1, 3) that appears in the error message: it has 27 elements, while your initializer only supplies 9.

    In other words, this is what a kernel should look like for a 2D convolution in TensorFlow: (kernel_height, kernel_width, input_channels, output_channels).
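As a rough illustration of that shape rule, here is a minimal NumPy-only sketch. The helper name `conv2d_kernel_shape` is hypothetical (it is not a Keras function), and it assumes the default `groups=1` and channels-last data, as in the question.

```python
import numpy as np

def conv2d_kernel_shape(kernel_size, in_channels, filters):
    """Shape a Conv2D layer asks its kernel initializer for (assumes groups=1)."""
    return tuple(kernel_size) + (in_channels, filters)

kernel_array = np.ones((3, 3))   # the 2D kernel from the question
input_shape = (3, 3, 4, 1)       # (batch, height, width, channels)

expected = conv2d_kernel_shape(kernel_array.shape, input_shape[-1], filters=3)
print(expected)                  # (3, 3, 1, 3)
print(kernel_array.size)         # 9 elements supplied
print(int(np.prod(expected)))    # 27 elements requested
```

The mismatch between 9 supplied and 27 requested elements is exactly what the TypeError reports.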

    Maybe try thinking about it like this: if your kernel doesn't have that last dimension of size 3 handling the output channels, then when you convolve, how does it go from 1 channel to 3? The only fallback would be to reuse the same kernel 3 times and produce 3 identical channels, and the Constant initializer does not assume that for you.
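To make the role of the last kernel dimension concrete, here is a naive NumPy reference of a stride-1, "same"-padded, channels-last convolution. It is only a sketch for intuition, not how TensorFlow implements it, and `conv2d_same` is a hypothetical helper name. Like Keras, it computes a cross-correlation (no kernel flip).

```python
import numpy as np

def conv2d_same(x, kernel):
    """Naive stride-1, zero-padded ('same') 2D cross-correlation, channels-last."""
    batch, height, width, in_ch = x.shape
    kh, kw, k_in, out_ch = kernel.shape
    assert k_in == in_ch, "kernel input channels must match the input"
    # Total padding is kernel_size - 1 per spatial axis; any odd remainder goes at the end.
    ph, pw = kh - 1, kw - 1
    padded = np.pad(
        x, ((0, 0), (ph // 2, ph - ph // 2), (pw // 2, pw - pw // 2), (0, 0))
    )
    out = np.zeros((batch, height, width, out_ch), dtype=x.dtype)
    for i in range(height):
        for j in range(width):
            window = padded[:, i:i + kh, j:j + kw, :]  # (batch, kh, kw, in_ch)
            # Contract the window with the kernel over (kh, kw, in_ch) -> (batch, out_ch)
            out[:, i, j, :] = np.tensordot(window, kernel, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 3, 4, 1))   # same shape as the input in the question
kernel = np.ones((3, 3, 1, 3))          # 1 input channel -> 3 output channels
y = conv2d_same(x, kernel)
print(y.shape)                          # (3, 3, 4, 3)
```

With an all-ones kernel, each of the 3 output channels is the same 3x3 window sum, which is the "same kernel 3 times" case described above.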

    So there are a few things you can do:

    1. Have a kernel initializer with the right shape (account for kernel shape, input channels, and output channels):
    # kernel
    kernel_array = np.ones((3, 3, 1, 3))
    kernel = tf.keras.initializers.Constant(kernel_array)
    
    2. Change the number of output channels to 1, so the expected kernel shape (3, 3, 1, 1) has the same 9 elements as your original (3, 3) array:
    y = tf.keras.layers.Conv2D(
        1, (3, 3), padding="same", strides=(1, 1),
        kernel_initializer=kernel,
        input_shape=input_shape[1:])(x)
    
    3. Initialize with a scalar, which fills a kernel of any shape (this only works when every weight should have the same value):
    # kernel
    kernel = tf.keras.initializers.Constant(1.)
    ...
    y = tf.keras.layers.Conv2D(
        3, (3, 3), padding="same", strides=(1, 1),
        kernel_initializer=kernel,
        input_shape=input_shape[1:])(x)
    

    Depending on what your end goal is and what flexibility you want with your operations, each of these solutions will provide viable results.
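If the 3x3 kernel has non-uniform values and you want every filter to use it, one possible approach is to broadcast the 2D array to the 4D shape before wrapping it in the initializer. This is a sketch under assumptions: the kernel values below are made up for illustration, and repeating the same kernel across filters is one choice among several.

```python
import numpy as np

# Hypothetical custom kernel values, chosen only for illustration
kernel_2d = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float32)
in_channels, filters = 1, 3

# Add the two channel axes, then repeat the 2D kernel along them.
# .copy() turns the read-only broadcast view into a regular array.
kernel_4d = np.broadcast_to(
    kernel_2d[:, :, np.newaxis, np.newaxis],
    kernel_2d.shape + (in_channels, filters),
).copy()

print(kernel_4d.shape)   # (3, 3, 1, 3)
```

The resulting `kernel_4d` could then be passed to `tf.keras.initializers.Constant(kernel_4d)` in place of the 2D array, as in option 1.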