I am currently trying to calculate the Jacobian matrix in my training loop using tf.GradientTape() and batch_jacobian in TensorFlow 2. Unfortunately, I only obtain None values...
My current attempt looks like this:
for step, (batch_x, batch_y) in enumerate(train_data):
    with tf.GradientTape(persistent=True) as g:
        g.watch(batch_x)
        g.watch(batch_y)

        logits = self.retrained(batch_x, is_training=True)
        loss = lstm.cross_entropy_loss(logits, batch_y)
        acc = lstm.accuracy(logits, batch_y)
        avg_loss += loss
        avg_acc += acc

    gradients = g.gradient(loss, self.retrained.trainable_variables)
    J = g.batch_jacobian(logits, batch_x, experimental_use_pfor=False)
    print(J.numpy())
    self.optimizer.apply_gradients(zip(gradients, self.retrained.trainable_variables))
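For reference, batch_jacobian does return a tensor when the watched source is a float tensor that is actually connected to the target. A common reason for getting None is differentiating with respect to an integer input (e.g., token IDs feeding an embedding layer), since gradients do not flow through integer tensors. Below is a minimal sketch of a working call; the small Dense model and the shapes are placeholders of my own, not your setup:

import tensorflow as tf

# Stand-in model: batch of 8 samples, 4 float features, 3 outputs.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(5, activation="tanh"),
    tf.keras.layers.Dense(3),
])

batch_x = tf.random.normal((8, 4))  # must be a float dtype, not int

with tf.GradientTape() as g:
    g.watch(batch_x)  # needed because batch_x is a plain tensor, not a Variable
    logits = model(batch_x)

# One Jacobian per sample: shape (8, 3, 4) = (batch, outputs, inputs).
J = g.batch_jacobian(logits, batch_x)
print(J.shape)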
The following code uses TensorFlow 2:
import tensorflow as tf
Here I create a simple neural net and then take its partial derivatives with respect to the inputs:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(2,)),  # z below has shape (1, 2)
    tf.keras.layers.Dense(3),
    tf.keras.layers.Dense(2)])
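Writing the network output as f(x, y) = (f1(x, y), f2(x, y)), the matrix we are after is the 2x2 Jacobian of the two outputs with respect to the two inputs,

$$
J = \begin{pmatrix}
\frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y} \\
\frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y}
\end{pmatrix},
$$

evaluated here at (x, y) = (2, 3).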
Now I use the GradientTape to calculate the Jacobian matrix at the inputs x=2.0, y=3.0:
x = tf.Variable([[2.0]])
y = tf.Variable([[3.0]])

with tf.GradientTape(persistent=True) as t:
    t.watch([x, y])  # Variables are watched automatically; this just makes it explicit
    z = tf.concat([x, y], 1)
    f1 = model(z)[0][0]
    f2 = model(z)[0][1]
# persistent=True lets us take several gradients from the same tape
df1_dx = t.gradient(f1, x).numpy()
df1_dy = t.gradient(f1, y).numpy()
df2_dx = t.gradient(f2, x).numpy()
df2_dy = t.gradient(f2, y).numpy()
del t  # release the persistent tape's resources

print(df1_dx, df1_dy)
print(df2_dx, df2_dy)
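If you prefer a single call, tf.GradientTape.jacobian returns the whole matrix at once. Here is a sketch with the same model, feeding the two inputs as one tensor (the variable names are mine):

z = tf.constant([[2.0, 3.0]])

with tf.GradientTape() as t:
    t.watch(z)       # z is a plain tensor here, so it must be watched
    out = model(z)   # shape (1, 2)

# jacobian returns shape (1, 2, 1, 2) = (batch, output, batch, input);
# squeezing the batch dimensions leaves the 2x2 Jacobian.
J = tf.squeeze(t.jacobian(out, z))
print(J.numpy())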
Keeping in mind that the net's weights are initialised randomly, the Jacobian matrix (the printed output) will look something like this:
[[-0.832729]] [[-0.19699946]]
[[-0.5562407]] [[0.53551793]]
I have tried to explain how to calculate the Jacobian matrix of an explicitly written function and of a neural net in more detail here.