According to Sutton's book Reinforcement Learning: An Introduction, the update equation for the network weights is:

    w_{t+1} = w_t + α · δ_t · e_t

where δ_t is the TD error and e_t is the eligibility trace.
This is similar to a gradient descent update, but with an extra e_t term.
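For reference, the trace itself (in the accumulating form Sutton & Barto use) is a decaying sum of past gradients:

    e_t = γ · λ · e_{t−1} + ∇_w v̂(S_t, w_t)

so each step's update pushes the weights along every recently visited gradient direction, weighted by the current TD error.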
Can this eligibility trace be included in tf.train.GradientDescentOptimizer in TensorFlow?
Here's a simple example of using tf.contrib.layers.scale_gradient to do elementwise multiplication of gradients. In the forward pass it's just an identity op; in the backward pass it multiplies gradients by its second argument.
import tensorflow as tf

with tf.Graph().as_default():
  some_value = tf.constant([0., 0., 0.])
  # Forward pass: identity. Backward pass: incoming gradients are
  # multiplied elementwise by [0.1, 0.2, 0.3].
  scaled = tf.contrib.layers.scale_gradient(some_value, [0.1, 0.2, 0.3])
  (some_value_gradient,) = tf.gradients(tf.reduce_sum(scaled), some_value)
  with tf.Session():
    print(scaled.eval())               # unchanged forward value
    print(some_value_gradient.eval())  # scaled gradient of the sum
Prints:
[ 0.  0.  0.]
[ 0.1         0.2         0.30000001]
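Coming back to the original question: scale_gradient handles the elementwise multiplication, but an eligibility trace also has to accumulate across steps, so another option is to keep the trace in a non-trainable Variable and hand the trace-weighted update to the optimizer directly. Here's a minimal sketch for a linear value function; gamma, lam, alpha, and the placeholders are illustrative names of mine, not part of the TensorFlow API:

import tensorflow as tf

# Hypothetical hyperparameters for semi-gradient TD(lambda).
gamma, lam, alpha = 0.99, 0.9, 0.01

with tf.Graph().as_default():
  w = tf.Variable([0.5, 0.5, 0.5])                  # value-function weights
  features = tf.placeholder(tf.float32, shape=[3])  # state features for S_t
  delta = tf.placeholder(tf.float32, shape=[])      # TD error, fed each step

  value = tf.reduce_sum(w * features)  # linear value estimate v(S_t)
  (grad,) = tf.gradients(value, [w])

  # Accumulating trace: e_t = gamma * lam * e_{t-1} + grad. Using the
  # assign's output below guarantees the trace updates before the step.
  trace = tf.Variable(tf.zeros([3]), trainable=False)
  new_trace = trace.assign(gamma * lam * trace + grad)

  # apply_gradients takes (gradient, variable) pairs. Feeding -delta * e_t
  # turns the optimizer's descent step w -= alpha * g into the book's
  # update w += alpha * delta * e_t.
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=alpha)
  train_op = optimizer.apply_gradients([(-delta * new_trace, w)])

At each environment step you'd feed the current features and TD error and run train_op, and you'd also want to reset the trace (e.g. run trace.assign(tf.zeros([3]))) at episode boundaries.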