python · variables · tensorflow · variable-assignment · timing

Tensorflow: When are variable assignments done in sess.run with a list?


I had thought that variable assignments are done only after all operations in a list given to sess.run have been evaluated, but the following code returns different results on different executions. It seems that the operations in the list are run in a random order, and that each variable is assigned as soon as its own operation has run.

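import tensorflow as tf

a = tf.Variable(0)
b = tf.Variable(1)
c = tf.Variable(1)
update_a = tf.assign(a, b + c)
update_b = tf.assign(b, c + a)
update_c = tf.assign(c, a + b)

with tf.Session() as sess:
  # The initializer has to be called to build the init op.
  sess.run(tf.initialize_all_variables())
  for i in range(5):
    a_, b_, c_ = sess.run([update_a, update_b, update_c])
    print(a_, b_, c_)  # these values differ between executions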

I'd like to know the timing of variable assignments. Which is correct: "update_x -> assign x -> ... -> update_z -> assign z" or "update_x -> update_y -> update_z -> assign a, b, c" (where (x, y, z) is a permutation of (a, b, c))? In addition, if there is a way to get the latter behavior (all assignments are done only after every operation in the list has been computed), please let me know how to do it.


Solution

  • The three operations update_a, update_b, and update_c have no interdependencies in the dataflow graph, so TensorFlow may choose to execute them in any order. (In the current implementation, it is possible that all three of them will be executed in parallel on different threads.) A second nit is that reads of variables are cached by default, so in your program the value assigned in update_b (i.e. c + a) may use the original or the updated value of a, depending on when the variable is first read.

    If you want to ensure that the operations happen in a particular order, you can use with tf.control_dependencies([...]): blocks to enforce that operations created within the block happen after operations named in the list. You can use tf.Variable.read_value() inside a with tf.control_dependencies([...]): block to make the point at which the variable is read explicit.

    Therefore, if you want to ensure that update_a happens before update_b and update_b happens before update_c, you could do:

    update_a = tf.assign(a, b + c)
    
    with tf.control_dependencies([update_a]):
      update_b = tf.assign(b, c + a.read_value())
    
    with tf.control_dependencies([update_b]):
      update_c = tf.assign(c, a.read_value() + b.read_value())
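
The same mechanism can also be turned around to get the second behavior asked about in the question, where every new value is computed from the old values before any variable is written. The following is only a sketch under some assumptions: it uses the graph-mode TF 1.x API from the question, imported through tensorflow.compat.v1 so that it can also run under TensorFlow 2, and the helper name run_simultaneous_updates is made up for illustration. The idea is to build the three sums first and then create the assign ops inside a tf.control_dependencies block that names all three sums.

```python
# Assumption: TF 1.x graph-mode API, reached through tensorflow.compat.v1 so
# that the sketch can also run under TensorFlow 2.
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def run_simultaneous_updates(steps):
    """Hypothetical helper: update a, b, c from their pre-update values."""
    graph = tf.Graph()
    with graph.as_default():
        a = tf.Variable(0)
        b = tf.Variable(1)
        c = tf.Variable(1)

        # Compute every new value first; nothing is assigned at this point.
        new_a = b + c
        new_b = c + a
        new_c = a + b

        # Assign ops created in this block wait for all three sums, so no
        # variable is overwritten while another sum still needs to read it.
        with tf.control_dependencies([new_a, new_b, new_c]):
            update_a = tf.assign(a, new_a)
            update_b = tf.assign(b, new_b)
            update_c = tf.assign(c, new_c)

        init = tf.global_variables_initializer()

    results = []
    with tf.Session(graph=graph) as sess:
        sess.run(init)
        for _ in range(steps):
            a_, b_, c_ = sess.run([update_a, update_b, update_c])
            results.append((int(a_), int(b_), int(c_)))
    return results


if __name__ == "__main__":
    print(run_simultaneous_updates(3))
```

With this graph the three assignments can still run in any order relative to each other, but that no longer matters, because none of them reads a variable. Starting from a = 0, b = 1, c = 1, three steps should give (2, 1, 1), (2, 3, 3), and (6, 5, 5) on every execution.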