I'm new to Theano, and I'm having trouble. I'm trying to use Theano to create a neural network that can be used for a regression task (instead of a classification task). After reading a lot of tutorials, I came to the conclusion that I could do that by creating an output layer which just handles the regression, and prepending a "normal" neural net with a few hidden layers in front of it (but that still lies in the future).
So this is my "model":
#!/usr/bin/env python

import numpy as np
import theano
import theano.tensor as T

class RegressionLayer(object):
    """Class that represents the linear regression, will be the output layer
    of the network"""
    def __init__(self, input, n_in, learning_rate):
        self.n_in = n_in
        self.learning_rate = learning_rate
        self.input = input

        self.weights = theano.shared(
            value=np.zeros((n_in, 1), dtype=theano.config.floatX),
            name='weights',
            borrow=True
        )

        self.bias = theano.shared(
            value=0.0,
            name='bias'
        )

        self.regression = T.dot(input, self.weights) + self.bias
        self.params = [self.weights, self.bias]

    def cost_function(self, y):
        return (y - self.regression) ** 2
To train the model as in the Theano tutorials, I tried the following:
In [5]: x = T.dmatrix('x')
In [6]: reg = r.RegressionLayer(x, 3, 0)
In [8]: y = theano.shared(value = 0.0, name = "y")
In [9]: cost = reg.cost_function(y)
In [10]: T.grad(cost=cost, wrt=reg.weights)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-10-0326df05c03f> in <module>()
----> 1 T.grad(cost=cost, wrt=reg.weights)
/home/name/PythonENVs/Theano/local/lib/python2.7/site-packages/theano/gradient.pyc in grad(cost, wrt, consider_constant, disconnected_inputs, add_names, known_grads, return_disconnected)
430
431 if cost is not None and cost.ndim != 0:
--> 432 raise TypeError("cost must be a scalar.")
433
434 if isinstance(wrt, set):
TypeError: cost must be a scalar.
I feel like I did exactly the same thing (only with the math I need) as in Theano's logistic regression tutorial (http://deeplearning.net/tutorial/logreg.html), but it doesn't work. So why can't I create the gradients?
Your cost function should probably be a sum (or mean) of squares. At the moment it is a vector of squares, but you need to condense it down to a single value in order to be able to take the gradient of the then-scalar function. This is usually done like this:
def cost_function(self, y):
    return ((y - self.regression) ** 2).mean()
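With that change the cost is a scalar, so T.grad works and you can wire up a plain gradient-descent training function. Here is a minimal sketch, assuming your RegressionLayer from above with the .mean() fix, the default floatX=float64 (to match the dmatrix inputs), and a made-up toy dataset; the learning rate and data are just illustrative:

import numpy as np
import theano
import theano.tensor as T

x = T.dmatrix('x')   # (n_examples, n_in) inputs
y = T.dmatrix('y')   # (n_examples, 1) targets, matching the layer's output shape

reg = RegressionLayer(x, n_in=3, learning_rate=0.1)
cost = reg.cost_function(y)   # now a scalar, so T.grad accepts it

# one gradient per parameter, plus simple gradient-descent updates
grads = [T.grad(cost, p) for p in reg.params]
updates = [(p, p - reg.learning_rate * g) for p, g in zip(reg.params, grads)]

train = theano.function(inputs=[x, y], outputs=cost, updates=updates)

# toy data: targets are a fixed linear function of the inputs
data_x = np.random.randn(100, 3)
data_y = data_x.dot(np.array([[1.0], [2.0], [3.0]]))   # shape (100, 1)

for epoch in range(200):
    loss = train(data_x, data_y)
print(loss)

Note that y here is a symbolic matrix with the same (n_examples, 1) shape as self.regression; using a vector or a shared scalar for y (as in your session) either broadcasts in surprising ways or only makes sense for a single constant target.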