Tags: tensorflow, deep-learning, keras, theano

Multi-dimensional regression with Keras


I want to use Keras to train a neural network for 2-dimensional regression.

My input is a single number, and my output has two numbers:

from keras import initializers
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import Adam

model = Sequential()
model.add(Dense(16, input_shape=(1,), kernel_initializer=initializers.constant(0.0), bias_initializer=initializers.constant(0.0)))
model.add(Activation('relu'))
model.add(Dense(16, input_shape=(1,), kernel_initializer=initializers.constant(0.0), bias_initializer=initializers.constant(0.0)))
model.add(Activation('relu'))
model.add(Dense(2, kernel_initializer=initializers.constant(0.0), bias_initializer=initializers.constant(0.0)))
adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(loss='mean_squared_error', optimizer=adam)

I then created some dummy data for training:

import numpy as np

inputs = np.zeros((10, 1), dtype=np.float32)
targets = np.zeros((10, 2), dtype=np.float32)

for i in range(10):
    inputs[i] = i / 10.0
    targets[i, 0] = 0.1
    targets[i, 1] = 0.01 * i

And finally, I trained in a loop, feeding the whole dataset as a single batch and testing on the training data after every step:

while True:
    loss = model.train_on_batch(inputs, targets)
    test_outputs = model.predict(inputs)
    print(test_outputs)

The problem is, the outputs printed out are as follows:

[0.1, 0.045]
[0.1, 0.045]
[0.1, 0.045]
.....
.....
.....

So, whilst the first dimension is correct (0.1), the second dimension is not. The second dimension should be [0.0, 0.01, 0.02, .....], one value per input. Instead, the output from the network (0.045) is simply the average of all the target values in the second dimension.
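
As a quick sanity check of that arithmetic, here is a minimal sketch in plain Python. It only reuses the target formula from the dummy data above (`second_column` and `mean_value` are names made up for this check) and confirms that 0.045 is the mean of the second target column:

```python
# Second target column from the dummy data: targets[i, 1] = 0.01 * i for i in 0..9.
second_column = [0.01 * i for i in range(10)]

# A network that outputs one constant for every input can do no better under
# mean squared error than this average.
mean_value = sum(second_column) / len(second_column)
print(round(mean_value, 6))
```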

What am I doing wrong?


Solution

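  • The problem is that you are initializing all the weights to zero. If all the weights in a layer are the same, all of its neurons receive the same gradient, so it is as if you had a network with a single neuron in every layer. With zeros it is even worse: every ReLU outputs 0, so the gradients of both hidden layers and of the output kernel are zero, and only the output bias is trained. A constant output trained with mean squared error converges to the mean of the targets, which is exactly the [0.1, 0.045] you see. Remove the initializers so that the default random initialization is used, and it works: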

    # Same architecture, but with Keras' default (random) kernel initialization.
    model = Sequential()
    model.add(Dense(16, input_shape=(1,)))
    model.add(Activation('relu'))
    model.add(Dense(16))  # input_shape is only needed on the first layer
    model.add(Activation('relu'))
    model.add(Dense(2))
    model.compile(loss='mean_squared_error', optimizer='Adam')
    

    The result after 1000 epochs:

    Epoch 1000/1000
    10/10 [==============================] - 0s - loss: 5.2522e-08
    
    In [59]: test_outputs
    Out[59]:
    array([[ 0.09983768,  0.00040025],
           [ 0.09986718,  0.010469  ],
           [ 0.09985521,  0.02051429],
           [ 0.09984323,  0.03055958],
           [ 0.09983127,  0.04060487],
           [ 0.09995781,  0.05083206],
           [ 0.09995599,  0.06089856],
           [ 0.09995417,  0.07096504],
           [ 0.09995237,  0.08103154],
           [ 0.09995055,  0.09109804]], dtype=float32)
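
To see why the zero-initialized network gets stuck, the sketch below redoes one backward pass by hand in NumPy instead of going through Keras. It is only an illustration under stated assumptions: the helper names (`zero_params`, `gradients`) are made up here, the loss is taken to be the mean of the squared errors over all output elements, and the layer sizes and dummy data are copied from the question. With every parameter at zero, the only non-zero gradient belongs to the output bias.

```python
import numpy as np

# Same dummy data as in the question.
inputs = (np.arange(10, dtype=np.float64) / 10.0).reshape(10, 1)
targets = np.stack([np.full(10, 0.1), 0.01 * np.arange(10)], axis=1)


def zero_params():
    """All-zero (kernel, bias) pairs for a 1 -> 16 -> 16 -> 2 network."""
    sizes = [(1, 16), (16, 16), (16, 2)]
    return [(np.zeros(s), np.zeros(s[1])) for s in sizes]


def gradients(params, x, y):
    """Hand-written backward pass for the MSE loss of a two-hidden-layer ReLU MLP."""
    (w1, b1), (w2, b2), (w3, b3) = params
    z1 = x @ w1 + b1
    h1 = np.maximum(z1, 0.0)
    z2 = h1 @ w2 + b2
    h2 = np.maximum(z2, 0.0)
    out = h2 @ w3 + b3

    d_out = 2.0 * (out - y) / y.size   # d(mean squared error) / d(out)
    d_z2 = (d_out @ w3.T) * (z2 > 0)   # zero here because w3 is all zeros
    d_z1 = (d_z2 @ w2.T) * (z1 > 0)
    return [
        (x.T @ d_z1, d_z1.sum(axis=0)),
        (h1.T @ d_z2, d_z2.sum(axis=0)),
        (h2.T @ d_out, d_out.sum(axis=0)),  # kernel gradient is zero because h2 is zero
    ]


grads = gradients(zero_params(), inputs, targets)
for layer, (d_w, d_b) in enumerate(grads, start=1):
    print("layer", layer, "max |dW| =", np.abs(d_w).max(), "max |db| =", np.abs(d_b).max())

# The only trainable quantity left is the output bias, and the best constant
# output under MSE is the column mean of the targets.
print(targets.mean(axis=0))
```

Since the kernels receive zero gradient they stay at zero on the next step as well, so the situation never changes: the output bias drifts to the column means of the targets and the network ignores its input. Any random initialization breaks this, which is why dropping the constant initializers is enough.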