I'm using Theano for classification (convolutional neural networks).
Previously, I've been using the pixel values of the (flattened) image as the features of the NN.
Now, I want to add additional features.
I've been told that I can concatenate that vector of additional features to the flattened image features and then use that as input to the fully-connected layer, but I'm having trouble with that.
First of all, is that the right approach?
Here are some code snippets and the error I get.
The code is similar to the example provided on their site, with some modifications.
(from the class that builds the model)
# allocate symbolic variables for the data
self.x = T.matrix('x') # the data is presented as rasterized images
self.y = T.ivector('y') # the labels are presented as 1D vector of [int] labels
self.f = T.matrix('f') # additional features
Below, the variables v and rng are defined previously. What's important is layer2_input:
layer2_input = self.layer1.output.flatten(2)
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)])
self.layer2 = HiddenLayer(rng, input=layer2_input, n_in=v, n_out=200, activation=T.tanh)
(from the class that trains)
train_model = theano.function([index], cost, updates=updates,
    givens={
        model.x: train_set_x[index * batch_size: (index + 1) * batch_size],
        model.y: train_set_y[index * batch_size: (index + 1) * batch_size],
        model.f: train_set_f[index * batch_size: (index + 1) * batch_size]
    })
However, I get an error when train_model is called:
ValueError: GpuJoin: Wrong inputs for input 1 related to inputs 0.!
Apply node that caused the error: GpuJoin(TensorConstant{0}, GpuElemwise{tanh,no_inplace}.0, GpuFlatten{2}.0)
Inputs shapes: [(), (5, 11776), (5, 2)]
Inputs strides: [(), (11776, 1), (2, 1)]
Inputs types: [TensorType(int8, scalar), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Do the input shapes represent the shapes of x, y and f, respectively?
If so, the third seems correct (batchsize=5, 2 extra features), but why is the first a scalar and the second a matrix?
More details:
train_set_x.shape = (61, 19200) [61 flattened images (160x120), 19200 pixels]
train_set_y.shape = (61,) [61 integer labels]
train_set_f.shape = (61,2) [2 additional features per image]
batch_size = 5
Do I have the right idea or is there a better way of accomplishing this? Any insights into why I'm getting an error?
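The issue was that I was concatenating along the wrong axis. T.concatenate defaults to axis=0, which stacks rows and requires the column counts to match, but the two inputs here have shapes (5, 11776) and (5, 2). (The shapes listed in the error belong to the inputs of the GpuJoin node, i.e. the axis constant, the layer-1 output and the flattened f, not to x, y and f.)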
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)])
should have been
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)], axis=1)
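To illustrate the shape rule, here is a minimal sketch that uses NumPy arrays as stand-ins for the Theano tensors. It assumes T.concatenate follows the same join semantics as np.concatenate; the array names are hypothetical, and the shapes are taken from the error message above.

```python
import numpy as np

batch_size = 5

# Stand-ins (assumptions) for the symbolic tensors in the question:
# conv_features ~ self.layer1.output.flatten(2), extra_features ~ self.f
conv_features = np.zeros((batch_size, 11776), dtype=np.float32)
extra_features = np.ones((batch_size, 2), dtype=np.float32)

# Default axis=0 stacks rows, so the column counts (11776 vs 2) must match.
try:
    np.concatenate([conv_features, extra_features])
    axis0_failed = False
except ValueError:
    axis0_failed = True

# axis=1 appends the extra features as new columns for each sample.
combined = np.concatenate([conv_features, extra_features], axis=1)
n_in = combined.shape[1]

print(axis0_failed, combined.shape, n_in)
```

If the same rule holds in the real model, the n_in passed to HiddenLayer (v in the question) would presumably have to count the two extra columns as well, i.e. 11776 + 2 = 11778 for these shapes.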