I'm trying to understand why one of these implementations works and the other doesn't. I'm trying to represent some geometry in TensorFlow.
First, a helper file, d_math.py
import numpy as np
import tensorflow as tf

dtype = tf.float64

def skew_symmetric(vector):
    # Creates a tensorflow matrix which is a skew-symmetric version of the input vector
    return tf.stack([(0., -vector[2], vector[1]),
                     (vector[2], 0., -vector[0]),
                     (-vector[1], vector[0], 0.)], axis=0)
Here is implementation 1:
#!/usr/bin/env python3
import numpy as np
import tensorflow as tf
import d_math as d
import math
import time

class Joint():
    def __init__(self, axis, pos): #TODO: right now only revolute:
        axis_ = tf.Variable(axis, dtype=d.dtype)
        axis_ /= tf.linalg.norm(axis)
        theta_ = tf.Variable(0.0, dtype=d.dtype) #Always at the 0 angle config
        self.theta_ = theta_
        self.R_ = tf.cos(theta_) * tf.eye(3, dtype=d.dtype) + d.skew_symmetric(axis_) + (1. - tf.cos(theta_)) * tf.einsum('i,j->ij', axis_, axis_)

joint = Joint(np.array([1.0, 1.0, 1.0]), 0.0)

init = tf.global_variables_initializer()

with tf.Session() as session:
    session.run(init)
    print(joint.R_)
    print(joint.R_.eval())
    joint.theta_ = joint.theta_.assign(math.pi/4.)
    session.run(joint.theta_)
    print(joint.R_.eval())
The above version updates theta, and I then get the evaluation of two rotation matrices: one for theta = 0 and one for theta = pi/4.
I then tried to refactor my code a bit, adding a global session variable created in a separate file, and hiding as much of TensorFlow as I could behind the API for now:
version 2:
#!/usr/bin/env python3
import numpy as np
import tensorflow as tf
import d_math as d
import math
import time
import session as s

class Joint():
    def __init__(self, axis, pos): #TODO: right now only revolute:
        axis_ = tf.Variable(axis, dtype=d.dtype)
        axis_ = axis_ / tf.linalg.norm(axis)
        theta_ = tf.Variable(0.0, dtype=d.dtype) #Always at the 0 angle config
        self.theta_ = theta_
        self.R_ = tf.cos(theta_) * tf.eye(3, dtype=d.dtype) + d.skew_symmetric(axis_) + (1. - tf.cos(theta_)) * tf.einsum('i,j->ij', axis_, axis_)

    def set_theta(self, theta):
        self.theta_.assign(theta)
        s.session.run(self.theta_)

joint = Joint(np.array([1.0, 1.0, 1.0]), 0.0)

init = tf.global_variables_initializer()

with s.session as session:
    session.run(init)
    print(joint.R_)
    print(joint.R_.eval())
    #joint.theta_ = joint.theta_.assign(math.pi/4.)
    joint.set_theta(math.pi/4.)
    print(joint.R_.eval())
session.py can be seen here:
#!/usr/bin/env python3
import tensorflow as tf
session = tf.Session()
This gives the R matrix with theta = 0 for both evaluations.
Can someone please explain to me why implementation 2 isn't working?
tf.assign returns a reference to the updated variable. According to the docs:
Returns: A Tensor that will hold the new value of 'ref' after the assignment has completed.
In the first example, you're actually using the updated reference:
joint.theta_ = joint.theta_.assign(math.pi/4.)
session.run(joint.theta_)
print(joint.R_.eval())
In the second example you're not using the updated reference:
def set_theta(self, theta):
    not_used = self.theta_.assign(theta)
    s.session.run(self.theta_)
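To make the difference concrete, here is a minimal sketch of what version 2 effectively does (assuming TensorFlow 1.x graph mode, as in your code; the variable names are illustrative): the assign op is added to the graph but never run, and running the variable itself only reads its current value.

import tensorflow as tf

theta = tf.Variable(0.0, dtype=tf.float64)
not_used = theta.assign(0.785)  # the assign op exists in the graph, but is never run below

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(theta)             # this only reads theta, it does not trigger the assign
    print(sess.run(theta))      # still 0.0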
My best guess is that if you use the updated reference, it should work:
def set_theta(self, theta):
    self.theta_ = self.theta_.assign(theta)
    s.session.run(self.theta_)
Also, it would be wise not to overwrite the original tensor reference, so I would create a new attribute for the updated variable:
def set_theta(self, theta):
    self.theta_updated_ = self.theta_.assign(theta)
    s.session.run(self.theta_updated_)

# ...
print(self.theta_updated_.eval())  # <<< This should give you updated value
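One general property of assign ops worth keeping in mind with this approach (a small sketch of my own, not your code): fetching the tensor returned by assign performs the assignment as a side effect, while fetching the original variable just reads its current value.

import tensorflow as tf

theta_ = tf.Variable(0.0, dtype=tf.float64)
theta_updated_ = theta_.assign(0.5)   # op that writes 0.5 into theta_ when run

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(theta_))           # 0.0 -- plain read, no side effects
    print(sess.run(theta_updated_))   # 0.5 -- fetching this tensor runs the assignment
    print(sess.run(theta_))           # 0.5 -- the original variable now reflects the update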
Important: However, running print(joint.R_.eval()) may still not give you the updated value, because the self.R_ operation is not forced to depend on the updated reference self.theta_updated_; you may have to use tf.control_dependencies to enforce that self.R_ executes only after the update is done. For example:
with tf.control_dependencies([self.theta_updated_]):
    self.R_ = tf.cos(theta_) * # ...
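Here is a self-contained sketch of that pattern (TensorFlow 1.x; I use read_value() inside the block so the read itself is created there and is therefore ordered after the assignment, otherwise the variable's cached snapshot may be read before the update):

import math
import tensorflow as tf

theta_ = tf.Variable(0.0, dtype=tf.float64)
theta_updated_ = theta_.assign(math.pi / 4.)

with tf.control_dependencies([theta_updated_]):
    # Ops created here run only after theta_updated_ has executed.
    R_ = tf.cos(theta_.read_value()) * tf.eye(3, dtype=tf.float64)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(R_))   # fetching R_ forces the assignment first, so theta_ = pi/4 here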
Final Note: Assigning values to variables doesn't automatically tell other operations that they need to wait until the assignment is done; I discovered this the hard way. Here are a few snippets that I wrote that trace how variables behave when tf.assign is used. I recommend going carefully through the snippet called Optimizing original variables that have been updated using tf.assign. The snippets are self-contained.
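As a minimal illustration of that point (my own sketch, not one of the snippets mentioned above): an op built from a variable has no dependency on any assign op, so it only sees the new value after you explicitly run the assignment.

import tensorflow as tf

x = tf.Variable(1.0)
double_x = x * 2.0       # built from x, but has no dependency on any assign op
set_x = x.assign(10.0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(double_x))   # 2.0  -- the assign op exists but has not run
    sess.run(set_x)             # explicitly run the assignment
    print(sess.run(double_x))   # 20.0 -- subsequent reads see the new value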