python keras neural-network tensorflow2.0 tf.keras

Hello World Neural Network to understand basics


I am trying to build a noob neural net using TensorFlow that learns the equation y = 2x - 1. Training data -

import numpy as np
from tensorflow import keras

xs = np.array([0, 1, 2, 3, 4, 5], dtype=float)
ys = np.array([-1, 1, 3, 5, 7, 9], dtype=float)

I am using 3 different models that differ only in input_shape, as below -

Model 1 works perfectly -

model1 = keras.models.Sequential()
model1.add(keras.layers.Dense(units=1, input_shape=[1]))
model1.compile(optimizer='sgd', loss='mean_squared_error')

Model 2 works perfectly as well -

model2 = keras.models.Sequential()
model2.add(keras.layers.Dense(units=1, input_shape=(1,)))
model2.compile(optimizer='sgd', loss='mean_squared_error')

Model 3 fails in model.fit() -

model3 = keras.models.Sequential()
model3.add(keras.layers.Dense(units=1, input_shape=(None,1,)))
model3.compile(optimizer='sgd', loss='mean_squared_error')

Here is the error -

ValueError: in user code:

    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1284, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1268, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1249, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1050, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "/usr/local/lib/python3.10/dist-packages/keras/engine/input_spec.py", line 253, in assert_input_compatibility
        raise ValueError(

    ValueError: Exception encountered when calling layer 'sequential_13' (type Sequential).
    
    Input 0 of layer "dense_13" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
    
    Call arguments received by layer 'sequential_13' (type Sequential):
      • inputs=tf.Tensor(shape=(None,), dtype=float32)
      • training=True
      • mask=None

Here is the rest of the code I am using to train and predict with each model (model stands for model1, model2 or model3).

model.fit(xs, ys, epochs=20)
model.predict([10.0])

Why does model1 work even though it does not use a tuple, and why does model3 fail even though it specifies the optional batch size as None?


Solution

  • In Model 1, input_shape=[1] says that each sample has one feature. Keras accepts either a list or a tuple here, so Model 2 with input_shape=(1,) is equivalent. In both cases input_shape describes a single sample and leaves out the batch dimension. Keras adds that dimension itself, so both models expect batches of shape (batch_size, 1). Your xs has shape (6,), one scalar per sample, and in the Keras version shown in your traceback that 1-D array is evidently accepted by a model expecting (batch_size, 1), since Models 1 and 2 train without error.

    In Model 3, input_shape=(None, 1) does not set an optional batch size. Because the batch dimension is never part of input_shape, the None is read as an extra, variable-length axis: each sample is expected to have shape (steps, 1), and a full batch has shape (batch_size, steps, 1), which is 3-D. Your data is 1-D, so the Dense layer receives a tensor of shape (None,) and raises the error you see: expected min_ndim=2, found ndim=1. To feed one feature per sample, use input_shape=(1,) or [1], as in Models 1 and 2.
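To make the shape rule concrete, here is a minimal NumPy sketch rather than actual Keras code. It assumes that a Dense layer computes inputs @ kernel + bias on its last axis. The helper dense_forward and the hand-picked kernel and bias values are hypothetical stand-ins for a layer that has already learned y = 2x - 1.

```python
import numpy as np

xs = np.array([0, 1, 2, 3, 4, 5], dtype=float)
ys = np.array([-1, 1, 3, 5, 7, 9], dtype=float)

# Hypothetical "already trained" parameters for y = 2x - 1.
# A Dense(units=1) layer on one feature has a (1, 1) kernel and a (1,) bias.
kernel = np.array([[2.0]])
bias = np.array([-1.0])


def dense_forward(inputs):
    """NumPy stand-in for a Dense layer: inputs @ kernel + bias."""
    return inputs @ kernel + bias


# input_shape=(1,) or [1]: batches look like (batch_size, 1).
batch_2d = xs.reshape(-1, 1)
out_2d = dense_forward(batch_2d)
print(batch_2d.shape, "->", out_2d.shape)

# input_shape=(None, 1): batches look like (batch_size, steps, 1).
batch_3d = xs.reshape(-1, 1, 1)
out_3d = dense_forward(batch_3d)
print(batch_3d.shape, "->", out_3d.shape)

# A raw 1-D array has no feature axis, so the matrix product is undefined.
try:
    dense_forward(xs)
    raw_1d_ok = True
except ValueError as err:
    raw_1d_ok = False
    print("1-D input rejected:", err)
```

The 2-D batch reproduces ys exactly, while the raw 1-D array has no feature axis for the kernel to multiply against. This is the same kind of mismatch the min_ndim=2 check in the traceback reports.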