Tags: python, machine-learning, neural-network, theano, conv-neural-network

Adding additional features in Theano (CNN)


I'm using Theano for classification (convolutional neural networks).

Previously, I've been using the pixel values of the (flattened) image as the features of the network. Now, I want to add additional features.
I've been told that I can concatenate the vector of additional features to the flattened image features and then use the result as input to the fully-connected layer, but I'm having trouble with that.

First of all, is that the right approach?

Here are some code snippets and my errors.
The code is similar to the example provided on the Theano tutorial site, with some modifications.

(from the class that builds the model)

 # allocate symbolic variables for the data
 self.x = T.matrix('x')   # the data is presented as rasterized images
 self.y = T.ivector('y')  # the labels are presented as a 1D vector of [int] labels
 self.f = T.matrix('f')   # additional features

In the snippet below, the variables v and rng are defined earlier; what's important is layer2_input:

layer2_input = self.layer1.output.flatten(2)
layer2_input = T.concatenate([layer2_input, self.f.flatten(2)])
self.layer2 = HiddenLayer(rng, input=layer2_input, n_in=v, n_out=200, activation=T.tanh)
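For reference, Theano's `flatten(2)` keeps the leading (batch) dimension and flattens the remaining ones, so the hidden layer receives one row per example. A minimal NumPy sketch of the equivalent reshape (the array sizes here are made up for illustration, not taken from the actual model):

```python
import numpy as np

# Hypothetical layer1 output: batch of 5 examples, 4 feature maps of 3x3
layer1_output = np.zeros((5, 4, 3, 3), dtype=np.float32)

# flatten(2) keeps the first dimension and flattens the rest,
# equivalent to reshaping to (batch_size, -1)
layer2_input = layer1_output.reshape(layer1_output.shape[0], -1)
print(layer2_input.shape)  # (5, 36)
```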

(from the class that trains)

train_model = theano.function([index], cost, updates=updates,
          givens={
             model.x: train_set_x[index * batch_size: (index + 1) * batch_size],
             model.y: train_set_y[index * batch_size: (index + 1) * batch_size],
             model.f: train_set_f[index * batch_size: (index + 1) * batch_size]
          })
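The `givens` substitute minibatch slices of the shared datasets into the symbolic variables, so the extra features get the same slicing treatment as the images and labels. A NumPy sketch of what one slice looks like (assuming 61 examples with 2 extra features each, as described in the question):

```python
import numpy as np

batch_size = 5
# 61 examples, 2 additional features per image (placeholder values)
train_set_f = np.zeros((61, 2), dtype=np.float32)

# Each call to train_model(index) sees this slice of the features
index = 3
batch_f = train_set_f[index * batch_size: (index + 1) * batch_size]
print(batch_f.shape)  # (5, 2)
```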

However, I get an error when the train_model is called:

ValueError: GpuJoin: Wrong inputs for input 1 related to inputs 0.!
Apply node that caused the error: GpuJoin(TensorConstant{0}, GpuElemwise{tanh,no_inplace}.0, GpuFlatten{2}.0)
Inputs shapes: [(), (5, 11776), (5, 2)]
Inputs strides: [(), (11776, 1), (2, 1)]
Inputs types: [TensorType(int8, scalar), CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]

Do the input shapes represent the shapes of x, y and f, respectively?

If so, the third seems correct (batch_size=5, 2 extra features), but why is the first a scalar and the second a matrix?

More details:

train_set_x.shape = (61, 19200) [61 flattened images (160x120), 19200 pixels]
train_set_y.shape = (61,) [61 integer labels]
train_set_f.shape = (61,2) [2 additional features per image]
batch_size = 5

Do I have the right idea or is there a better way of accomplishing this? Any insights into why I'm getting an error?


Solution

  • The issue was that I was concatenating along the wrong axis: T.concatenate defaults to axis=0, which stacks the two matrices row-wise, while the extra features need to be appended column-wise to each example's row.

    layer2_input = T.concatenate([layer2_input, self.f.flatten(2)])
    

    should have been

    layer2_input = T.concatenate([layer2_input, self.f.flatten(2)], axis=1)
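To see why the axis matters, here is a NumPy sketch using the shapes from the error message: the default `axis=0` tries to stack the two matrices row-wise, which fails because their column counts differ, while `axis=1` appends the two extra features to each example's row:

```python
import numpy as np

hidden = np.zeros((5, 11776), dtype=np.float32)  # flattened layer1 output
extra = np.zeros((5, 2), dtype=np.float32)       # 2 extra features per example

# axis=0 (the default) stacks rows, so the column counts must match -> error
try:
    np.concatenate([hidden, extra])
except ValueError as e:
    print("axis=0 fails:", e)

# axis=1 joins along columns: one row per example, features appended
joined = np.concatenate([hidden, extra], axis=1)
print(joined.shape)  # (5, 11778)
```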