
Reassigning Variables in Tensorflow and scope

I'm trying to understand why one of these implementations works and the other doesn't. I'm trying to represent some geometry in TensorFlow.

First, a helper file (d_math.py):

#!/usr/bin/env python3

import numpy as np
import tensorflow as tf

dtype = tf.float64

def skew_symmetric(vector):
    #Creates a tensorflow matrix which is a skew-symmetric version of the input vector    
    return tf.stack([(0., -vector[2], vector[1]), (vector[2], 0., -vector[0]), (-vector[1], vector[0], 0.)], axis=0)
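As a quick sanity check of the helper's layout, the matrix it builds should reproduce the cross product (K v = k x v) and should equal the negative of its own transpose. Here is a minimal NumPy sketch of that check; `skew_symmetric_np` is a hypothetical stand-in for the TensorFlow helper, not part of the original file, and the test vectors are arbitrary.

```python
import numpy as np

def skew_symmetric_np(vector):
    # NumPy stand-in (hypothetical name) with the same row layout as the helper above
    x, y, z = vector
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

k = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])
K = skew_symmetric_np(k)

# A skew-symmetric matrix equals the negative of its transpose
is_skew = np.allclose(K, -K.T)
# Multiplying by K is the same as taking the cross product with k
matches_cross = np.allclose(K @ v, np.cross(k, v))
print(is_skew, matches_cross)
```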

Here is implementation 1:

#!/usr/bin/env python3
import numpy as np
import tensorflow as tf
import d_math as d
import math
import time

class Joint():
    def __init__(self, axis, pos): #TODO: right now only revolute:
        axis_ = tf.Variable(axis, dtype=d.dtype)
        axis_ /= tf.linalg.norm(axis)
        theta_ = tf.Variable(0.0, dtype=d.dtype) #Always at the 0 angle config
        self.theta_ = theta_
        # Rodrigues' formula: the skew-symmetric term is scaled by sin(theta)
        self.R_ = tf.cos(theta_) * tf.eye(3, dtype=d.dtype) + tf.sin(theta_) * d.skew_symmetric(axis_) + (1. - tf.cos(theta_)) * tf.einsum('i,j->ij', axis_, axis_)

joint = Joint(np.array([1.0, 1.0, 1.0]), 0.0)
init = tf.global_variables_initializer()    

with tf.Session() as session:    
    joint.theta_ = joint.theta_.assign(math.pi/4.)

The above version updates theta, and then I get the evaluation of two rotation matrices, one for theta = 0, and one for theta = pi/4.

I then tried to refactor my code a bit, adding a global session variable created in a separate file and hiding as much of TensorFlow as I could behind the API for now.

Here is version 2:

#!/usr/bin/env python3
import numpy as np
import tensorflow as tf
import d_math as d
import math
import time
import session as s

class Joint():
    def __init__(self, axis, pos): #TODO: right now only revolute:
        axis_ = tf.Variable(axis, dtype=d.dtype)
        axis_ = axis_ / tf.linalg.norm(axis)
        theta_ = tf.Variable(0.0, dtype=d.dtype) #Always at the 0 angle config
        self.theta_ = theta_
        # Rodrigues' formula: the skew-symmetric term is scaled by sin(theta)
        self.R_ = tf.cos(theta_) * tf.eye(3, dtype=d.dtype) + tf.sin(theta_) * d.skew_symmetric(axis_) + (1. - tf.cos(theta_)) * tf.einsum('i,j->ij', axis_, axis_)
    def set_theta(self, theta):
        self.theta_.assign(theta)  # the tensor returned by assign is discarded

joint = Joint(np.array([1.0, 1.0, 1.0]), 0.0)
init = tf.global_variables_initializer()    

with s.session as session:  
    #joint.theta_ = joint.theta_.assign(math.pi/4.)
    print(joint.R_.eval())

The session module (session.py) can be seen here:

#!/usr/bin/env python3
import tensorflow as tf

session = tf.Session()

This gives the R matrix with theta = 0 for both evaluations.

Can someone please explain to me why implementation 2 isn't working?


  • tf.assign returns a reference to the updated variable. According to the docs, it returns "a Tensor that will hold the new value of 'ref' after the assignment has completed."

    In the first example, you're actually using the updated reference:

    joint.theta_ = joint.theta_.assign(math.pi/4.)

    In the second example you're not using the updated reference:

    def set_theta(self, theta):
        not_used = self.theta_.assign(theta)

    My best guess is that if you use the updated reference, it should work:

    def set_theta(self, theta):
        self.theta_ = self.theta_.assign(theta)

    Also, it would be wise not to overwrite the original tensor reference, so I would store the updated variable in a new attribute:

    def set_theta(self, theta):
        self.theta_updated_ = self.theta_.assign(theta)
    # ...
    print(self.theta_updated_.eval())  # <<< This should give you updated value

    Important: running print(joint.R_.eval()) may still not give you the updated value, because the self.R_ operation is not forced to depend on the updated reference self.theta_updated_. You may have to use tf.control_dependencies so that self.R_ is executed only after the update is done. For example:

    with tf.control_dependencies([self.theta_updated_]):
        self.R_ = tf.cos(theta_) * # ...

    Final note: assigning a value to a variable doesn't automatically tell other operations that they need to wait until the assignment is done. I discovered this the hard way. Here are a few snippets that I wrote that trace how variables behave when tf.assign is used. I recommend going carefully through the snippet called "Optimizing original variables that have been updated using tf.assign". The snippets are self-contained.
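To make the ordering point concrete, here is a minimal sketch, assuming the graph-mode API reached through tensorflow.compat.v1 so that it also runs on a TensorFlow 2.x install. The names cos_theta, set_theta, phi, set_phi, and sin_phi are made up for illustration and do not come from the question. It shows two things: an assign op changes the variable only when it is actually run in a session, and tf.control_dependencies can force a read to happen after an assignment.

```python
import math
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # build a graph and run it in a session, as in the question

# Part 1: an assign op only takes effect when it is run.
theta = tf.Variable(0.0, dtype=tf.float64)
cos_theta = tf.cos(theta)                 # reads the variable whenever it is evaluated
set_theta = theta.assign(math.pi / 4.0)   # describes the assignment, does not perform it

# Part 2: force a read to happen after an assignment.
phi = tf.Variable(0.0, dtype=tf.float64)
set_phi = phi.assign(math.pi / 2.0)
with tf.control_dependencies([set_phi]):
    # read_value() creates a fresh read op inside this block,
    # so it cannot execute before set_phi has run
    sin_phi = tf.sin(phi.read_value())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    before = sess.run(cos_theta)   # set_theta has not been run: theta is still 0
    sess.run(set_theta)            # the assignment happens here
    after = sess.run(cos_theta)    # the same op now sees the new value
    forced = sess.run(sin_phi)     # runs set_phi first, then the read

print(before, after, forced)
```

If this sketch carries over to the question's class, the analogous change would be for set_theta to run the assign op in the shared session rather than only create it. That is a suggestion based on the sketch, not something verified against the original code.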