
SGD optimizer: Cannot iterate over a scalar tensor


Trying to execute code examples from the book, I get a similar issue for several examples, probably because I use a newer TensorFlow version. Minimal code to reproduce:

import tensorflow as tf
import numpy as np

X = tf.constant(np.linspace(-1, 1, 101), dtype=tf.float32)
Y = tf.constant(np.linspace(-1, 1, 101), dtype=tf.float32)
w = tf.Variable(0., name="weights", dtype=tf.float32)

cost = lambda: tf.square(Y - tf.multiply(X, w))

train_op =  tf.keras.optimizers.SGD(0.01)
#train_op =  tf.compat.v1.train.GradientDescentOptimizer(0.01)  # works with this optimizer

train_op.minimize(cost, w)   # error is here

Error:

Traceback (most recent call last):
  File "/home/alex/tmp/test.py", line 15, in 
    train_op.minimize(cost, w)
  File "/home/alex/.local/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 526, in minimize
    grads_and_vars = self.compute_gradients(loss, var_list, tape)
  File "/home/alex/.local/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 260, in compute_gradients
    return list(zip(grads, var_list))
  File "/home/alex/.local/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 583, in __iter__
    raise TypeError("Cannot iterate over a scalar tensor.")
TypeError: Cannot iterate over a scalar tensor.

If I replace tf.keras.optimizers.SGD with tf.compat.v1.train.GradientDescentOptimizer, this code works as expected. How can I get it working with the SGD optimizer?

Python version is 3.10.6, TensorFlow version is 2.11.0:

alex@alex-22:~$ python3
Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
>>> import tensorflow as tf
>>> print(tf.__version__)
2.11.0

This code is from "Machine Learning with TensorFlow" book, full example code is here: https://github.com/chrismattmann/MLwithTensorFlow2ed/blob/master/TFv2/ch03/Listing%203.01%20-%203.02.ipynb


Solution

  • It's probably because the var_list parameter of optimizer.minimize expects a list or tuple instead of a scalar value. From the docs:

    var_list: list or tuple of Variable objects to update to minimize loss, or a callable returning the list or tuple of Variable objects. Use callable when the variable list would otherwise be incomplete before minimize since the variables are created at the first time loss is called.

    In your code, it would be

    train_op.minimize(cost, [w])
    

    Check the documentation; there are plenty of examples of this.
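
    Putting it together, here is a minimal sketch of the fixed script. It assumes TF 2.x; the only change to the original is wrapping w in a list (plus a tf.reduce_mean to make the loss an explicit scalar, which is the conventional form, and a loop so the fit is visible):

    ```python
    import tensorflow as tf
    import numpy as np

    X = tf.constant(np.linspace(-1, 1, 101), dtype=tf.float32)
    Y = tf.constant(np.linspace(-1, 1, 101), dtype=tf.float32)
    w = tf.Variable(0., name="weights", dtype=tf.float32)

    # Reduce per-example squared errors to a scalar loss.
    cost = lambda: tf.reduce_mean(tf.square(Y - tf.multiply(X, w)))

    opt = tf.keras.optimizers.SGD(0.01)
    for _ in range(1000):
        opt.minimize(cost, [w])  # var_list is a list, not a bare Variable

    print(float(w))  # converges toward 1.0, since Y == X here
    ```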