python python-3.x tensorflow machine-learning gradient-descent

Gradient descent using tensors calculates wrong values

I am implementing a simple gradient descent algorithm using tensors. It learns two parameters, m and c.
The plain Python code for it is:

for i in range(epochs): 
    Y_pred = m*X + c  # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred))  # Derivative wrt m
    D_c = (-2/n) * sum(Y - Y_pred)  # Derivative wrt c
    m = m - L * D_m  # Update m
    c = c - L * D_c  # Update c
    print (m, c)

Output for Python:

0.7424335285442664 0.014629895049575754
1.1126970531591416 0.021962519495058154
1.2973530613155333 0.025655870599552183
1.3894434413955663 0.027534253868790198
1.4353697670010162 0.028507481513901086

Equivalent TensorFlow code:

#Graph of gradient descent
y_pred = m*x + c
d_m = (-2/n) * tf.reduce_sum(x*(y-y_pred)) 
d_c = (-2/n) * tf.reduce_sum(y-y_pred)  
upm = tf.assign(m, m - learning_rate * d_m)
upc = tf.assign(c, c - learning_rate * d_c)

#starting session
sess = tf.Session()

#Training for epochs
for i in range(epochs):
    sess.run(y_pred)
    sess.run(d_m)
    sess.run(d_c)
    sess.run(upm)
    sess.run(upc)
    w = sess.run(m)
    b = sess.run(c)
    print(w,b)

Output for TensorFlow:

0.7424335285442664 0.007335550424492317
1.1127687194584988 0.011031122807663662
1.2974962163433057 0.012911024540805463
1.3896400798226038 0.013885244876397126
1.4356019721347115 0.014407698787092268

The outputs above show the first 5 values of m and c. Parameter m is the same in both versions on the first iteration and nearly the same afterwards, but parameter c differs, even though the two implementations look identical. The value of c from the TensorFlow version is approximately half of the value from plain Python.
I don't know where my mistake is.

For recreating the entire output: Repo containing data along with both implementations

The repo also contains an image of the graph obtained through TensorBoard, in the events directory.

Solution

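The problem is that, in the TF implementation, the updates are not performed atomically. Each sess.run call re-evaluates the part of the graph it needs, so by the time sess.run(upc) executes, sess.run(upm) has already assigned the new m, and d_c is computed from a y_pred that uses the new m instead of the old one. In other words, m and c are updated in an interleaved manner, whereas the plain Python loop computes both derivatives from the same Y_pred before changing either parameter. (For the same reason, the separate sess.run(y_pred), sess.run(d_m) and sess.run(d_c) calls have no effect on the updates.) To make the updates atomic, run upm and upc in a single call: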

sess.run([upm, upc])
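
The interleaving described above can be reproduced without TensorFlow. The following is only a minimal sketch in plain Python: the data X, Y, the learning rate, and the two helper functions step_simultaneous and step_interleaved are made up for illustration and are not taken from the repo. The first helper mirrors the plain Python loop from the question; the second recomputes the residuals after m has changed, which is what a separate sess.run(upc) ends up doing.

```python
def step_simultaneous(m, c, X, Y, lr):
    """One step where both derivatives use the same (old) m and c."""
    n = len(X)
    residuals = [y - (m * x + c) for x, y in zip(X, Y)]
    d_m = (-2 / n) * sum(x * r for x, r in zip(X, residuals))
    d_c = (-2 / n) * sum(residuals)
    return m - lr * d_m, c - lr * d_c


def step_interleaved(m, c, X, Y, lr):
    """One step where m is updated first and c then sees the new m."""
    n = len(X)
    residuals = [y - (m * x + c) for x, y in zip(X, Y)]
    d_m = (-2 / n) * sum(x * r for x, r in zip(X, residuals))
    m = m - lr * d_m
    # Residuals are recomputed here with the already-updated m.
    residuals = [y - (m * x + c) for x, y in zip(X, Y)]
    d_c = (-2 / n) * sum(residuals)
    return m, c - lr * d_c


# Hypothetical toy data lying on y = 2x + 1 (not the data from the question).
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]
lr = 0.01

m_sim, c_sim = step_simultaneous(0.0, 0.0, X, Y, lr)
m_int, c_int = step_interleaved(0.0, 0.0, X, Y, lr)

print("simultaneous:", m_sim, c_sim)
print("interleaved: ", m_int, c_int)
```

With this toy data, the first step gives the same m in both variants, while c is about 0.12 for the simultaneous update and about 0.1025 for the interleaved one. This is the same pattern as in the question: m agrees on the first iteration and c is too small.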