I am trying to apply an extra step -- let's say a simple multiplication -- to the gradients with respect to a subset of the trainable variables. Here's what I have:
import tensorflow as tf

def do_something(tgvt):
    new_tgvt = []
    for grad, var in tgvt:
        if grad is None:
            # Replace a missing gradient with zeros of the variable's shape
            new_tgvt.append((tf.zeros(tf.shape(var)), var))
        else:
            # Scale the gradient by 5
            new_tgvt.append((grad * 5, var))
    return new_tgvt
optimizer = tf.train.GradientDescentOptimizer(learning_rate = 1e-5)
params = tf.trainable_variables()
pars = [params[27], params[29]]
gradients = optimizer.compute_gradients(cost, var_list=pars)
tgv = [(g, v) for (g, v) in gradients]  # already a list of (gradient, variable) pairs
new_gradients = do_something(tgv)
train_op = optimizer.apply_gradients(new_gradients)
session = tf.Session()
session.run(tf.global_variables_initializer())
total_iterations = 0 # record the total iterations
for i in range(total_iterations, total_iterations + num_iterations):
    x_batch, y_batch = data.train.next_batch(batch_size)
    feed_dict = {X: x_batch, y_true: y_batch, keep_prob: 0.5}
    result = session.run([train_op, pars], feed_dict=feed_dict)
When I print the result, the gradients come out as None:
print(result[0])
print((result[1][0]).shape)
print((result[1][1]).shape)
None
(5, 5, 1, 36)
(5, 5, 36, 64)
Any idea how to fix this?
From the docs, apply_gradients returns:

An Operation that applies the specified gradients.

So calling sess.run on train_op is expected to give None: this Operation does not produce a value, it applies the gradient update to the variables.
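If you want to see the scaled gradient values themselves, fetch the gradient tensors rather than the op. A minimal sketch, reusing the session, new_gradients, and feed_dict from the question:

grads_only = [g for g, v in new_gradients]  # the (scaled) gradient tensors
_, grad_vals = session.run([train_op, grads_only], feed_dict=feed_dict)
print(grad_vals[0].shape)  # an actual array now, not None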
Why don't you check it yourself by printing the old and updated values of one of the variables?
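A small sketch of that check, again reusing the names from the question:

import numpy as np

before = session.run(pars)                  # variable values before the update
session.run(train_op, feed_dict=feed_dict)  # apply the scaled gradients
after = session.run(pars)                   # variable values after the update
print(np.abs(after[0] - before[0]).max())   # non-zero means the update was applied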