I am using Theano to build a neural network, but when I try to return two lists of tensors at once inside a single output list, I get this error:
# This is the line that causes the error
# type(nabla_w) == <type 'list'>
# type(nabla_w[0]) == <class 'theano.tensor.var.TensorVariable'>
backpropagate = function(func_inputs, [nabla_w, nabla_b])
TypeError: Outputs must be theano Variable or Out instances. Received [dot.0, dot.0, dot.0, dot.0] of type <type 'list'>
What kind of Theano structure should I use to return the two lists of tensors together so that I can retrieve them like this:
nabla_w, nabla_b = backpropagate(*args)
I tried some of the things from the Basic Tensor Functionality page, but none of them worked (for example, stack and stacklists).
Here is the error I get using theano.tensor.stack or stacklists:
ValueError: all the input array dimensions except for the concatenation axis must match exactly
Apply node that caused the error: Join(TensorConstant{0}, Rebroadcast{0}.0, Rebroadcast{0}.0, Rebroadcast{0}.0, Rebroadcast{0}.0)
Inputs shapes: [(), (1, 10, 50), (1, 50, 100), (1, 100, 200), (1, 200, 784)]
Inputs strides: [(), (4000, 400, 8), (40000, 800, 8), (160000, 1600, 8), (1254400, 6272, 8)]
Inputs types: [TensorType(int8, scalar), TensorType(float64, 3D), TensorType(float64, 3D), TensorType(float64, 3D), TensorType(float64, 3D)]
Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.
A little extra context for the code (this runs inside a class method; T is theano.tensor and function is theano.function):
weights = [T.dmatrix('w'+str(x)) for x in range(0, len(self.weights))]
biases = [T.dmatrix('b'+str(x)) for x in range(0, len(self.biases))]
nabla_b = []
nabla_w = []
# feedforward
x = T.dmatrix('x')
y = T.dmatrix('y')
activations = []
inputs = []
activations.append(x)
for i in xrange(0, self.num_layers-1):
    inputt = T.dot(weights[i], activations[i]) + biases[i]
    activation = 1 / (1 + T.exp(-inputt))  # sigmoid
    activations.append(activation)
    inputs.append(inputt)
delta = activations[-1]-y
nabla_b.append(delta)
nabla_w.append(T.dot(delta, T.transpose(inputs[-2])))
for l in xrange(2, self.num_layers):
    z = inputs[-l]
    spv = (1 / (1 + T.exp(-z))) * (1 - (1 / (1 + T.exp(-z))))  # sigmoid'(z)
    delta = T.dot(T.transpose(weights[-l+1]), delta) * spv
    nabla_b.append(delta)
    nabla_w.append(T.dot(delta, T.transpose(activations[-l-1])))
    T.set_subtensor(nabla_w[-l], T.dot(delta, T.transpose(inputs[-l-1])))
func_inputs = list(weights)
func_inputs.extend(biases)
func_inputs.append(x)
func_inputs.append(y)
backpropagate = function(func_inputs, [nabla_w, nabla_b])
This is not supported by Theano. When you call theano.function(inputs, outputs), outputs can only be one of two things:
1) a Theano variable
2) a list of Theano variables
(2) does not allow nested lists inside the top-level list, so you should flatten the lists in the outputs. This will return more than two outputs; see the sketch below.
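A minimal sketch of the flattening approach, assuming the variables from the question:

# flatten the two lists into one top-level list of Theano variables
outputs = list(nabla_w) + list(nabla_b)
backpropagate = function(func_inputs, outputs)

# split the flat result list back into the two groups after the call
results = backpropagate(*args)
grad_w = results[:len(nabla_w)]
grad_b = results[len(nabla_w):]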
A possible alternative is to copy each inner list into a single variable:
tensor_nabla_w = theano.tensor.stack(*nabla_w)
This requires that all elements of nabla_w have the same shape. It also adds an extra copy to the computation graph (so it could be a little slower).
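Retrieval would then look like the following sketch; note that it only works when every layer's gradient has the same shape, which the input shapes in the error above show is not the case here:

tensor_nabla_w = theano.tensor.stack(*nabla_w)
tensor_nabla_b = theano.tensor.stack(*nabla_b)
backpropagate = function(func_inputs, [tensor_nabla_w, tensor_nabla_b])
# each output is a single array whose leading axis indexes the layer
stacked_w, stacked_b = backpropagate(*args)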
Update 1: fix call to stack()
Update 2:
We now have the added constraint that the elements have different shapes (see the input shapes in the error above), so stack cannot be used. If they all have the same number of dimensions and the same dtype, you can use typed_list; otherwise you will need to modify Theano yourself or flatten the output lists as shown above.
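Here is a minimal typed_list sketch, adapted from the Theano documentation; a typed list only constrains the dtype and number of dimensions of its elements, not their shapes:

import theano
import theano.tensor as T
import theano.typed_list

# a symbolic list whose elements are float64 matrices of any shape
matrices = theano.typed_list.TypedListType(T.dmatrix)()
m = T.dmatrix()
appended = theano.typed_list.append(matrices, m)
f = theano.function([matrices, m], appended)
# f takes a Python list of matrices plus one matrix,
# and returns a Python list of numpy arrays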