Issue in simple neural network

I tried the code below in TensorFlow, in several variations, on a simple regression problem. I synthesized the data as y=10.0*x, with one input and one output variable, and I am using MSE as the loss function. However, TensorFlow gives me a loss between ~2000 and ~200000. I also tried relu activations, which did not help.

  1. What would be an appropriate model for such a simple regression problem?
  2. What should the model be for a problem like y=x^3?

    def PolynomialModel():
        inp = layers.Input((1))
        l=layers.Dense(16, activation='tanh')(inp)
        l=layers.Dense(8, activation='tanh')(l)
        l=layers.Dense(4, activation='tanh')(l)
        output=layers.Dense(1, activation='tanh')(l)
        return models.Model(inp,output)


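  • In fact, you don't need hidden layers at all. A single linear Dense layer is enough, provided the input is a vector of size degree + 1 holding the powers of x (x**0, x**1, ..., x**degree). For y=10.0*x the degree is 1; for y=x^3 it is 3. Check the documentation of PolynomialFeatures from sklearn for more information.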

    import tensorflow as tf
    from tensorflow.keras import layers, models, optimizers

    def PolynomialModel(degree):
        # One input per power of x: x**0, x**1, ..., x**degree
        inp = layers.Input((degree + 1,), name='Input')
        # Single linear unit (no activation): one weight per power of x, plus a bias
        out = layers.Dense(1, name='Output')(inp)
        return models.Model(inp, out, name='PolynomialRegressor')

    # Suppose you want to fit a polynomial function of degree 3
    degree = 3
    model = PolynomialModel(degree)
    model.compile(loss='mean_squared_error', optimizer=optimizers.Adam(0.1))

    # You need an input vector of size degree+1 (here: x**0, x**1, x**2 and x**3)
    x = tf.linspace(-20.0, 20.0, 1000)
    X = tf.transpose(tf.convert_to_tensor([x**p for p in range(degree+1)]))
    y = x**3

    model.fit(X, y, epochs=200)
    model.predict([[4**0, 4**1, 4**2, 4**3]])


    Epoch 1/200
    32/32 [==============================] - 0s 974us/step - loss: 891936.5000
    Epoch 2/200
    32/32 [==============================] - 0s 945us/step - loss: 25650.7812
    Epoch 3/200
    32/32 [==============================] - 0s 897us/step - loss: 1188.4584
    Epoch 4/200
    32/32 [==============================] - 0s 1ms/step - loss: 75.3480
    Epoch 5/200
    32/32 [==============================] - 0s 2ms/step - loss: 21.2651
    ...
    Epoch 196/200
    32/32 [==============================] - 0s 1ms/step - loss: 1.2130e-11
    Epoch 197/200
    32/32 [==============================] - 0s 1ms/step - loss: 1.5810e-11
    Epoch 198/200
    32/32 [==============================] - 0s 1ms/step - loss: 1.2358e-11
    Epoch 199/200
    32/32 [==============================] - 0s 1ms/step - loss: 1.2775e-11
    Epoch 200/200
    32/32 [==============================] - 0s 1ms/step - loss: 1.4443e-11


    >>> model.predict([[4**0, 4**1, 4**2, 4**3]])
    1/1 [==============================] - 0s 63ms/step
    array([[64.]], dtype=float32)
    >>> model.predict([[3**0, 3**1, 3**2, 3**3]])
    1/1 [==============================] - 0s 51ms/step
    array([[27.000002]], dtype=float32)
    >>> model.summary()
    Model: "PolynomialRegressor"
     Layer (type)                Output Shape              Param #   
     Input (InputLayer)          [(None, 4)]               0         
     Output (Dense)              (None, 1)                 5         
    Total params: 5 (20.00 Byte)
    Trainable params: 5 (20.00 Byte)
    Non-trainable params: 0 (0.00 Byte)
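
As for why the model in the question gets stuck at such a large loss: its last layer uses a tanh activation, and tanh can only output values between -1 and 1, so it can never reach targets such as 10.0*x once |x| is larger than 0.1. The NumPy sketch below illustrates this lower bound on the MSE. The x range is an assumption (the question does not state it), and the variable names are only illustrative.

```python
import numpy as np

# Synthetic data as in the question: y = 10.0 * x.
# The range of x is an assumption; the question does not give it.
x = np.linspace(-20.0, 20.0, 1000)
y = 10.0 * x

# tanh squashes its input into (-1, 1), so the best a tanh output unit
# can do for each target is the closest value inside [-1, 1].
best_tanh_prediction = np.clip(y, -1.0, 1.0)
mse_floor = np.mean((y - best_tanh_prediction) ** 2)

# A single linear unit (weight w, bias b) can represent y = 10.0 * x exactly.
w, b = 10.0, 0.0
linear_mse = np.mean((y - (w * x + b)) ** 2)

print(f"lowest MSE reachable with a tanh output: {mse_floor:.1f}")
print(f"MSE of a linear output with w=10, b=0:   {linear_mse:.1f}")
```

No amount of training or extra hidden layers can push the loss below that floor while the output activation is tanh, which is why the Dense(1) output layer above has no activation.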

    Note: the x**0 column is not really necessary, since it contains only ones and the Dense layer already has a bias term, but I wanted the same behavior as PolynomialFeatures. So for degree 3 you can also use an input of size 3 (x**1, x**2, x**3) instead of 4.
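
To illustrate the note, here is a minimal sketch that builds the feature matrix with and without the constant column and fits the version without it. It deliberately swaps Adam training for a closed-form least-squares fit (np.linalg.lstsq), so it runs with NumPy alone. The helper fit_linear is hypothetical, not part of any library. With sklearn, PolynomialFeatures(degree, include_bias=False) should give the same columns as X_no_ones.

```python
import numpy as np

degree = 3
x = np.linspace(-20.0, 20.0, 1000)
y = x ** 3

# Columns x**0, x**1, x**2, x**3 (same layout as X in the answer)
X_with_ones = np.vander(x, degree + 1, increasing=True)
# Drop the constant column, keeping x**1, x**2, x**3
X_no_ones = X_with_ones[:, 1:]

def fit_linear(features, targets):
    """Hypothetical helper: least-squares fit of targets ~ features @ w + b.

    The appended column of ones plays the role of the Dense layer's bias.
    """
    design = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coef[:-1], coef[-1]

w, b = fit_linear(X_no_ones, y)

# Predict for x = 4 using only x**1, x**2, x**3
query = np.array([4.0, 4.0 ** 2, 4.0 ** 3])
prediction = query @ w + b
print(prediction)  # should be very close to 4**3 = 64
```

The fitted weights should come out close to (0, 0, 1) with a bias near 0, which is the same solution the Dense layer converges to.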