When I run my code, I get a ValueError with the following message:
ValueError: Input dimension mis-match. (input[0].shape[1] = 1, input[2].shape[1] = 20)
Apply node that caused the error: Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)](Dot22.0, InplaceDimShuffle{x,0}.0, InplaceDimShuffle{x,0}.0)
Toposort index: 18
Inputs types: [TensorType(float64, matrix), TensorType(float64, row), TensorType(float64, row)]
Inputs shapes: [(20, 1), (1, 1), (1, 20)]
Inputs strides: [(8, 8), (8, 8), (160, 8)]
Inputs values: ['not shown', array([[ 0.]]), 'not shown']
Outputs clients: [[Elemwise{Composite{((i0 * i1) / i2)}}(TensorConstant{(1, 1) of 2.0}, Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)].0, Elemwise{mul,no_inplace}.0), Elemwise{Sqr}[(0, 0)](Elemwise{Composite{((i0 + i1) - i2)}}[(0, 0)].0)]]
My training data is a matrix with rows such as:
[ 815.257786 320.447 310.841]
The batches I'm feeding to my training function have shape (BATCH_SIZE, 3) and type TensorType(float64, matrix).
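For reference, I feed batches as contiguous slices of the training arrays, roughly like this (a sketch; train_fn stands in for my compiled training function, and it assumes len(x_train) is a multiple of BATCH_SIZE):
import numpy as np

for start in range(0, len(x_train), BATCH_SIZE):
    x_batch = x_train[start:start + BATCH_SIZE]  # shape (BATCH_SIZE, 3), float64
    y_batch = y_train[start:start + BATCH_SIZE]
    train_fn(x_batch, y_batch)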
My neural net is very simple:
import theano.tensor as T
import lasagne as nnet  # assuming 'nnet' aliases lasagne

self.inpt = T.dmatrix('inpt')  # symbolic float64 matrix: one row per sample
self.out = T.dvector('out')    # symbolic float64 vector of targets
self.network_in = nnet.layers.InputLayer(shape=(BATCH_SIZE, 3), input_var=self.inpt)
self.l0 = nnet.layers.DenseLayer(self.network_in, num_units=40,
                                 nonlinearity=nnet.nonlinearities.rectify)
self.network = nnet.layers.DenseLayer(self.l0, num_units=1,
                                      nonlinearity=nnet.nonlinearities.linear)
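As a sanity check (again assuming nnet is lasagne), the declared output shape of the last layer can be inspected; it shows the predictions come out as a matrix, not a vector:
print(nnet.layers.get_output_shape(self.network))  # -> (BATCH_SIZE, 1)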
My loss function is:
pred = nnet.layers.get_output(self.network)           # symbolic predictions, shape (BATCH_SIZE, 1)
loss = nnet.objectives.squared_error(pred, self.out)  # elementwise (pred - out)**2
loss = loss.mean()
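The training function is then compiled along these lines (a sketch, not my exact code: the SGD update rule and learning rate here are assumptions):
import theano

params = nnet.layers.get_all_params(self.network, trainable=True)
updates = nnet.updates.sgd(loss, params, learning_rate=0.01)
train_fn = theano.function([self.inpt, self.out], loss, updates=updates)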
I'm confused as to why I'm getting a dimension mismatch. I'm passing in the correct input and label types (matching my symbolic variables), and the shape of my input data matches the 'shape' parameter I give my InputLayer. I suspect it's a problem with how I'm specifying the batch size: with a batch size of 1 the network trains without any problem, and the input[2].shape[1] value in the error message equals my batch size. I'm quite new to machine learning, and any help would be greatly appreciated!
Turns out the problem was that my labels had the wrong dimensionality. The network outputs predictions of shape (batch_size, 1), but my labels were a vector of shape (batch_size,). When computing the squared error, Theano dimshuffles the vector into a (1, batch_size) row and tries to subtract it elementwise from the (batch_size, 1) prediction matrix; since the matrix's second dimension has length 1 but is not marked broadcastable, this fails with exactly the mismatch shown above.
My data had shapes:
x_train.shape == (batch_size, 3)
y_train.shape == (batch_size,)
And the symbolic inputs to my net were:
self.inpt = T.dmatrix('inpt')
self.out = T.dvector('out')
I was able to solve the problem by reshaping y_train into a column, and changing the symbolic output variable to a matrix to match:
y_train = np.reshape(y_train, y_train.shape + (1,))
# y_train.shape == (batch_size, 1)
self.out = T.dmatrix('out')  # matrix labels now match the (batch_size, 1) predictions
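An equivalent fix would have been to keep the vector labels and instead flatten the network output, so both sides of the loss are (batch_size,) vectors (a sketch under the same assumptions as above):
pred = nnet.layers.get_output(self.network).flatten()  # (batch_size, 1) -> (batch_size,)
loss = nnet.objectives.squared_error(pred, self.out)   # self.out stays T.dvector('out')
loss = loss.mean()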