How should I understand the shape in tf.keras.Input?


I have just started learning TensorFlow and Keras. Here is a code example:

import numpy as np
import tensorflow as tf

# Create a symbolic input
input = tf.keras.Input(shape=(2,), dtype=tf.float32)

# Do a calculation using it
result = 2*input + 1

# the result doesn't have a value
result

calc = tf.keras.Model(inputs=input, outputs=result)
print(calc(np.array([1,2,3,4])).numpy())
print(calc(2).numpy())

The documentation says this about shape: A shape tuple (integers), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors. Elements of this tuple can be None; None elements represent dimensions where the shape is not known.

But in the code above, both print lines work, even though to me the inputs are a 1-D array and a scalar. So how should I understand the shape?


Solution

  • The point is that tf.keras.Input produces a symbolic tensor, i.e. a placeholder, and it can be used with TensorFlow operations. From the source code:

      Note that even if eager execution is enabled,
      `Input` produces a symbolic tensor (i.e. a placeholder).
      This symbolic tensor can be used with other
      TensorFlow ops, as such:
      ```python
      x = Input(shape=(32,))
      y = tf.square(x)
      ```
    

    That's why those print lines both work.
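
To see what `shape` describes, it can help to inspect the symbolic tensor itself. The following is a minimal sketch (the variable names are only illustrative): the reported shape should start with a `None` entry for the batch axis, which is the part that `shape=` leaves out.

```python
import tensorflow as tf

# shape=(2,) describes ONE sample: a vector with 2 features.
vec_input = tf.keras.Input(shape=(2,))

# Keras prepends the batch axis, whose size is unknown (None).
vec_dims = tuple(vec_input.shape)
print(vec_dims)

# None may also appear inside shape for axes of unknown length,
# e.g. variable-length sequences of 3-dimensional vectors.
seq_input = tf.keras.Input(shape=(None, 3))
seq_dims = tuple(seq_input.shape)
print(seq_dims)
```

So `shape=(2,)` is a statement about each sample, and any batch of such samples has rank 2.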


    Now, here are some scenarios. In your code, you can set shape = [n] for any n >= 0 and still pass rank-0 and rank-1 inputs, which are a scalar and a vector respectively. But for a rank-2 input x (a matrix) you will get an error if n is not equal to x.shape[1]. For example:

    import tensorflow as tf 
    import numpy as np 
    
    # Create a symbolic input
    input  = tf.keras.Input(shape=[0], dtype=tf.float32)
    result = 2*input + 1 
    calc   = tf.keras.Model(inputs=input, outputs=result)
    
    print(calc(1).numpy())                     # scalar, rank 0
    print(calc(np.array([1,2,3,4])).numpy())   # vector, rank 1
    print(calc(np.array([[1,2,3,4]])).numpy()) # matrix, rank 2
    
    3.0
    [3. 5. 7. 9.]
    
    ValueError: Input 0 is incompatible with layer model_2: 
    expected shape=(None, 0), found shape=(1, 4)
    

    To solve it, we need to set the shape parameter to the exact number of features, which here should be 4.

    # Create a symbolic input
    input  = tf.keras.Input(shape=[4], dtype=tf.float32)
    result = 2*input + 1 
    calc   = tf.keras.Model(inputs=input, outputs=result)
    
    print(calc(1).numpy())                     # scalar, rank 0
    print(calc(np.array([1,2,3,4])).numpy())   # vector, rank 1
    print(calc(np.array([[1,2,3,4]])).numpy()) # matrix, rank 2
    
    3.0
    [3. 5. 7. 9.]
    [[3. 5. 7. 9.]]
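
Instead of hard-coding the feature count, it can be read off the data. This is only a sketch under the assumption that the data is already a 2D array of (samples, features); `data` and `feature_shape` are illustrative names, not part of the original code.

```python
import numpy as np
import tensorflow as tf

# Two samples with four features each: shape (2, 4).
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]], dtype="float32")

# Drop the leading batch axis; what remains is what Input expects.
feature_shape = data.shape[1:]

inp = tf.keras.Input(shape=feature_shape)
calc = tf.keras.Model(inputs=inp, outputs=2 * inp + 1)

# Convert the returned tensor to a NumPy array for inspection.
out = np.asarray(calc(data))
print(out)
```

Because the batch axis is excluded from `feature_shape`, the same model accepts batches of any size as long as each row has four features.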
    

    Here is one interesting fact: if we build this calc model with shape = [1] and call it first with a scalar or a vector and then with a 2D matrix, it won't complain about the 2D input. AFAIK, the error is raised only if the model has not been built yet; invoking the model with some input sets the shape of the model.

    # Create a symbolic input
    input  = tf.keras.Input(shape=[1], dtype=tf.float32)
    result = 2*input + 1 
    calc   = tf.keras.Model(inputs=input, outputs=result)
    
    print(calc(1).numpy())                     # scalar, rank 0
    print(calc(np.array([1,2,3,4])).numpy())   # vector, rank 1
    print(calc(np.array([[1,2,3,4]])).numpy()) # matrix, rank 2
    
    3.0
    [[3.]
     [5.]
     [7.]
     [9.]]
    [[3. 5. 7. 9.]]
    

    But AFAIK, it won't be possible to play around like this if you build a model with a trainable layer. In that case, you need to make sure that shape matches the input data. For example:

    x = tf.keras.Input(shape=[4])
    y = tf.keras.layers.Dense(10)(x)
    model = tf.keras.Model(x, y)
    
    print(model(np.array([[1,2,3,4]])).numpy()) # (1, 4)
    print(model(np.array([1,2,3,4])).numpy())   # (4,)
    
    [[ 1.4779348  -1.8168153  -0.93788755 -1.4927139  -0.23618054  2.4305463
      -1.6176091   0.6640817  -1.648994    3.5819988 ]]
    
    ValueError: Input 0 of layer dense_1 is incompatible with the layer: 
    : expected min_ndim=2, found ndim=1. Full shape received: (4,)
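
One way to avoid the `min_ndim=2` error is to give a single sample an explicit batch axis before calling the model. The sketch below assumes the same small Dense model as above; `sample` and `batch` are illustrative names.

```python
import numpy as np
import tensorflow as tf

x = tf.keras.Input(shape=(4,))
y = tf.keras.layers.Dense(10)(x)
model = tf.keras.Model(x, y)

sample = np.array([1, 2, 3, 4], dtype="float32")  # one sample, shape (4,)

# Add a leading batch axis so the sample becomes a batch of size 1.
batch = sample[np.newaxis, :]                      # shape (1, 4)

out = model(batch)
print(out.shape)
```

`np.expand_dims(sample, axis=0)` or `sample.reshape(1, -1)` would produce the same `(1, 4)` batch.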