I am trying to build a simple neural net using TensorFlow (I'm a beginner) that learns the relationship y = 2x - 1
Training data -
import numpy as np
from tensorflow import keras

xs = [0, 1, 2, 3, 4, 5]
ys = [-1, 1, 3, 5, 7, 9]
xs = np.array(xs, dtype=float)
ys = np.array(ys, dtype=float)
I am using 3 different models which differ only in the input_shape argument, as below -
Model 1 works perfectly -
model1 = keras.models.Sequential()
model1.add(keras.layers.Dense(units=1, input_shape=[1]))
model1.compile(optimizer='sgd', loss='mean_squared_error')
Model 2 works perfectly as well -
model2 = keras.models.Sequential()
model2.add(keras.layers.Dense(units=1, input_shape=(1,)))
model2.compile(optimizer='sgd', loss='mean_squared_error')
Model 3 fails in model.fit() -
model3 = keras.models.Sequential()
model3.add(keras.layers.Dense(units=1, input_shape=(None,1,)))
model3.compile(optimizer='sgd', loss='mean_squared_error')
Here is the error -
ValueError: in user code:
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1284, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1268, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1249, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.10/dist-packages/keras/engine/training.py", line 1050, in train_step
y_pred = self(x, training=True)
File "/usr/local/lib/python3.10/dist-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.10/dist-packages/keras/engine/input_spec.py", line 253, in assert_input_compatibility
raise ValueError(
ValueError: Exception encountered when calling layer 'sequential_13' (type Sequential).
Input 0 of layer "dense_13" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
Call arguments received by layer 'sequential_13' (type Sequential):
• inputs=tf.Tensor(shape=(None,), dtype=float32)
• training=True
• mask=None
Here is the rest of the code I am using to train the model and make a prediction.
model.fit(xs, ys, epochs=20)
model.predict([10.0])
Why does model1 work despite not using a tuple, and why does model3 fail despite specifying the optional batch size dimension as None?
In Model 1, you define the input shape as [1], which tells Keras that each sample has exactly one feature. This matches your data: xs supplies a single value per data point. Model 2 is equivalent to Model 1: Keras accepts input_shape as either a list or a tuple, so (1,) means exactly the same thing as [1].
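You can check this equivalence directly (a minimal sketch, assuming TF 2.x / tf.keras; the printed tuples come from the built model's input_shape property):

from tensorflow import keras

m_list = keras.models.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
m_tuple = keras.models.Sequential([keras.layers.Dense(units=1, input_shape=(1,))])

# Both models expect 2-D input: any batch size, one feature per sample.
print(m_list.input_shape)   # (None, 1)
print(m_tuple.input_shape)  # (None, 1)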
In Model 3, you specify the input shape as (None, 1). The important detail is that input_shape never includes the batch dimension; Keras always prepends a variable-size batch axis for you, which is why the error message shows shapes like (None,). So (None, 1) does not mean "variable batch size, one feature". It means each individual sample is a 2-D tensor of shape (variable_length, 1), so the model as a whole expects 3-D input of shape (batch, None, 1). Your xs is a 1-D array of scalars: Keras can quietly expand that to the (batch, 1) expected by Models 1 and 2, but it cannot invent the extra variable-length axis Model 3 demands, hence "expected min_ndim=2, found ndim=1". If you really want to spell out the batch dimension, the argument for that is batch_input_shape=(None, 1), not input_shape=(None, 1); otherwise just keep input_shape=(1,).
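Here is a minimal sketch of the corrected setup (assuming TF 2.x / tf.keras; the explicit reshape is optional, since Keras will add the trailing feature axis for you, but it makes the shapes obvious):

import numpy as np
from tensorflow import keras

# Shape the data as (num_samples, num_features) = (6, 1)
xs = np.array([0, 1, 2, 3, 4, 5], dtype=float).reshape(-1, 1)
ys = np.array([-1, 1, 3, 5, 7, 9], dtype=float).reshape(-1, 1)

model = keras.models.Sequential([
    keras.layers.Dense(units=1, input_shape=(1,))  # one feature per sample
])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=20, verbose=0)

# predict() also wants (num_samples, num_features); with enough epochs
# the output approaches 2*10 - 1 = 19.
print(model.predict(np.array([[10.0]])))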