I am following this example: https://keras.io/examples/rl/ddpg_pendulum/
I am using TensorFlow version 2.10.0 and Gym 0.26.2, and I am getting an error on this line:
    tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)
Exception has occurred: ValueError
Can't convert non-rectangular Python sequence to Tensor.
File "C:\Users\vlad.nanu\Documents\GitHub\ml-hub\pendulum.py", line 236, in <module>
tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)
ValueError: Can't convert non-rectangular Python sequence to Tensor.
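The release notes for Gym 0.26 list two breaking changes that affect the pendulum code. env.reset() now returns a tuple (observation, info) instead of only the observation, and env.step() now returns five values (observation, reward, terminated, truncated, info) instead of four. The first change is what triggers the error: prev_state ends up holding an array together with a dict, which tf.convert_to_tensor cannot convert to a tensor.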
You can change the following three lines (marked # changed):
    for ep in range(total_episodes):
        prev_state, _ = env.reset()  # changed
        episodic_reward = 0

        while True:
            # Uncomment this to see the Actor in action
            # But not in a python notebook.
            # env.render()
            tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)
            action = policy(tf_prev_state, ou_noise)
            # Receive state and reward from environment.
            state, reward, terminated, truncated, info = env.step(action)  # changed
            buffer.record((prev_state, action, reward, state))
            episodic_reward += reward
            buffer.learn()
            update_target(target_actor.variables, actor_model.variables, tau)
            update_target(target_critic.variables, critic_model.variables, tau)
            # End this episode when it is terminated or truncated
            if terminated or truncated:  # changed
                break
            prev_state = state

        ep_reward_list.append(episodic_reward)
        # Mean of last 40 episodes
        avg_reward = np.mean(ep_reward_list[-40:])
        print("Episode * {} * Avg Reward is ==> {}".format(ep, avg_reward))
        avg_reward_list.append(avg_reward)
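
If the same script has to run against both older and newer Gym versions, one option is to normalize the return values in small wrapper functions. The sketch below is only an illustration: reset_compat and step_compat are hypothetical helper names, not part of Gym or the Keras example, and the check on reset() assumes that an observation is never itself a 2-tuple whose second element is a dict.

```python
def reset_compat(env):
    """Return only the observation, whichever reset() convention env uses."""
    result = env.reset()
    # Gym >= 0.26 returns (observation, info); older versions return
    # the observation alone. Assumption: a bare observation is never a
    # 2-tuple ending in a dict.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    return result


def step_compat(env, action):
    """Return (state, reward, done, info) for 4- and 5-value step() results."""
    result = env.step(action)
    if len(result) == 5:
        # Newer API: collapse terminated/truncated into a single done flag.
        state, reward, terminated, truncated, info = result
        return state, reward, bool(terminated or truncated), info
    # Older API: already (state, reward, done, info).
    state, reward, done, info = result
    return state, reward, done, info
```

With these helpers, the loop could call prev_state = reset_compat(env) and state, reward, done, info = step_compat(env, action), then break on done as in the original example. Note that merging terminated and truncated loses the distinction between the two, which does not matter for this training loop but may for other algorithms.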