Tags: python, keras

"Can't convert non-rectangular Python sequence to Tensor" error when following the example from the Keras website


I am following this example: https://keras.io/examples/rl/ddpg_pendulum/

I am using TensorFlow 2.10.0 and Gym 0.26.2, and I am getting an error on this line:

    tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)

The full error:

    Exception has occurred: ValueError
    Can't convert non-rectangular Python sequence to Tensor.
      File "C:\Users\vlad.nanu\Documents\GitHub\ml-hub\pendulum.py", line 236, in <module>
        tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)
    ValueError: Can't convert non-rectangular Python sequence to Tensor.

Solution

  • Reading through the release notes of the latest Gym release (0.26.x), you will find two breaking changes that affect the pendulum code: env.reset() now returns an (observation, info) tuple, and env.step() now returns five values (observation, reward, terminated, truncated, info) instead of four. Because prev_state then holds that (array, info) tuple, tf.convert_to_tensor() raises the non-rectangular sequence error (see the sketch after the code below).

    You can change the following three lines (marked # changed):

    for ep in range(total_episodes):
    
        prev_state, _ = env.reset() # changed
        episodic_reward = 0
    
        while True:
            # Uncomment this to see the Actor in action
            # But not in a python notebook.
            # env.render()
    
            tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)
    
            action = policy(tf_prev_state, ou_noise)
            # Receive state and reward from the environment.
            state, reward, terminated, truncated, info = env.step(action) # changed
    
            buffer.record((prev_state, action, reward, state))
            episodic_reward += reward
    
            buffer.learn()
            update_target(target_actor.variables, actor_model.variables, tau)
            update_target(target_critic.variables, critic_model.variables, tau)
    
            # End this episode when the environment reports termination or truncation
            if terminated or truncated: # changed
                break
    
            prev_state = state
    
        ep_reward_list.append(episodic_reward)
    
        # Mean of last 40 episodes
        avg_reward = np.mean(ep_reward_list[-40:])
        print("Episode * {} * Avg Reward is ==> {}".format(ep, avg_reward))
        avg_reward_list.append(avg_reward)
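
  • To see why the original line fails and to check the new API in isolation, here is a minimal sketch (assuming gym 0.26.x and the Pendulum-v1 environment used in the Keras example):

    import gym
    import tensorflow as tf

    env = gym.make("Pendulum-v1")

    # In gym 0.26.x, reset() returns an (observation, info) tuple instead of just the observation.
    result = env.reset()
    print(result)  # (array([...], dtype=float32), {}) -- an ndarray paired with an info dict

    # Passing that tuple straight to tf.convert_to_tensor() is what triggers the
    # "Can't convert non-rectangular Python sequence to Tensor" error from the question:
    # an (array, dict) pair is not a rectangular sequence.
    prev_state, _ = result  # unpack only the observation
    tf_prev_state = tf.expand_dims(tf.convert_to_tensor(prev_state), 0)

    # step() now returns five values: (observation, reward, terminated, truncated, info).
    state, reward, terminated, truncated, info = env.step(env.action_space.sample())
    print(reward, terminated, truncated)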