I'm trying to compute an elementwise gradient, e.g.,
output f(X): 5-by-1 vector,
with respect to input X: 5-by-1 vector.
I can do this like so:
import numpy as np
import theano
import theano.tensor as T

X = T.vector('X')
f = X * 3
rfrx, updates = theano.scan(lambda j, f, X: T.grad(f[j], X),
                            sequences=T.arange(X.shape[0]),
                            non_sequences=[f, X])
fcn_rfrx = theano.function([X], rfrx)
fcn_rfrx(np.ones(5).astype('float32'))
and the result is
array([[ 3., 0., 0., 0., 0.],
[ 0., 3., 0., 0., 0.],
[ 0., 0., 3., 0., 0.],
[ 0., 0., 0., 3., 0.],
[ 0., 0., 0., 0., 3.]], dtype=float32)
But since this is not efficient, I want to get a 5-by-1 vector as the result, by doing something like:

rfrx, updates = theano.scan(lambda j, f, X: T.grad(f[j], X[j]), sequences=T.arange(X.shape[0]), non_sequences=[f, X])

which doesn't work (X[j] creates a new subtensor node that f[j] is not a function of, so the gradient is disconnected).
Is there any way to do this? (Sorry for the bad formatting; I'm new here and still learning.)
(I have added a clearer example.)
Given an input vector x[1], x[2], ..., x[n]
and an output vector y[1], y[2], ..., y[n],
where y[i] = f(x[i]),
I want only
df(x[i])/dx[i]
and not
df(x[i])/dx[j] for i ≠ j,
for computational efficiency (n is the number of data points, > 10000).
You are looking for theano.tensor.jacobian.
import theano
import theano.tensor as T

x = T.fvector()
# p[i] = sum_k x[k] ** i, a 5-vector in which every entry depends on all of x
p = T.as_tensor_variable([(x ** i).sum() for i in range(5)])
j = T.jacobian(p, x)  # full 5-by-len(x) matrix of partial derivatives
f = theano.function([x], [p, j])
Now evaluating yields
In [31]: f([1., 2., 3.])
Out[31]:
[array([ 3., 6., 14., 36., 98.], dtype=float32),
array([[ 0., 0., 0.],
[ 1., 1., 1.],
[ 2., 4., 6.],
[ 3., 12., 27.],
[ 4., 32., 108.]], dtype=float32)]
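As a sanity check on those numbers (a plain NumPy sketch, not Theano): row i of this Jacobian is the derivative of (x ** i).sum() with respect to each x[k], which is i * x ** (i - 1) evaluated elementwise:

```python
import numpy as np

x = np.array([1., 2., 3.])
# Row i of the Jacobian of p[i] = (x ** i).sum() is i * x ** (i - 1),
# applied elementwise; the i = 0 row is identically zero.
jac = np.array([i * x ** (i - 1) if i > 0 else np.zeros_like(x)
                for i in range(5)])
print(jac)
```

The rows reproduce the Theano output above.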
If you are interested in only one, or a few, partial derivatives, you can obtain only those as well. It would be necessary to take a close look at the Theano optimization rules to see how much more efficient this gets (a benchmark would be a first test). (It is possible that indexing into the gradient already makes it clear to Theano that it does not need to calculate the rest.)
x = T.fscalar()
y = T.fvector()
# Pack x and y into one vector so the cost depends on both
z = T.concatenate([x.reshape((1,)), y.reshape((-1,))])
e = (z ** 2).sum()
g = T.grad(e, wrt=x)  # only the partial derivative de/dx = 2 * x; de/dy is never requested
ff = theano.function([x, y], [e, g])
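For the strictly elementwise case in the question, where y[i] = f(x[i]) depends on x[i] alone, the Jacobian is diagonal, so the whole diagonal drops out of a single gradient: d(sum(y))/dx[i] = dy[i]/dx[i]. In Theano that would be T.grad(f.sum(), X), with no scan at all. Here is a NumPy sketch that checks this identity numerically with finite differences (f(x) = x ** 2 is just an assumed example function):

```python
import numpy as np

def f(x):
    return x ** 2          # assumed elementwise example function

x = np.array([1., 2., 3., 4., 5.])
analytic = 2 * x           # dy[i]/dx[i] for f(x) = x ** 2

# d(sum(f(x)))/dx[i] by forward finite differences: because f is
# elementwise, perturbing x[i] only moves y[i], so this recovers
# exactly the diagonal of the Jacobian.
eps = 1e-6
numeric = np.array([
    (f(x + eps * np.eye(len(x))[i]).sum() - f(x).sum()) / eps
    for i in range(len(x))
])

print(np.allclose(numeric, analytic, atol=1e-4))
```

This is why the gradient-of-the-sum trick is the efficient answer when n is large: it is one backward pass instead of n.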