Tags: neural-network, tensor, reinforcement-learning, tf.keras, policy-gradient-descent

TypeError: tuple indices must be integers or slices, not NoneType


I get a TypeError when I try to pass an observation to the neural network defined below:

env = gym.make("CartPole-v1",render_mode="rgb_array")
obs = env.reset()

n_inputs = env.observation_space.shape[0]

model = tf.keras.Sequential([
    tf.keras.layers.Dense(5, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

def play_one_step(env, obs, model, loss_function):
    
    with tf.GradientTape() as tape:
        
        left_probability = model(obs[np.newaxis])
        action = (tf.random.uniform([1, 1]) > left_probability)
        y_target = tf.constant([[1.]]) - tf.cast(action, tf.float32)
        loss = tf.reduce_mean(loss_function(y_target, left_probability))
        
    gradients = tape.gradient(loss, model.trainable_variables)
    obs, reward, done, truncated, info = env.step(int(action))
    
    return obs, reward, done, truncated, gradients

But later when I call this function I get the error:

Cell In [158], line 30, in play_episodes(env, n_episodes, n_max_steps, model, loss_function)
     26 obs = env.reset()
     28 for step in range(n_max_steps):
---> 30     obs, reward, done, truncated, gradients = play_one_step(env, obs, model, loss_function)
     31     current_rewards.append(reward)
     32     current_gradients.append(gradients)

Cell In [158], line 7, in play_one_step(env, obs, model, loss_function)
      3 def play_one_step(env, obs, model, loss_function):
      5     with tf.GradientTape() as tape:
----> 7         left_probability = model(obs[np.newaxis])
      8         action = (tf.random.uniform([1, 1]) > left_probability)
      9         y_target = tf.constant([[1.]]) - tf.cast(action, tf.float32)

TypeError: tuple indices must be integers or slices, not NoneType

I've tried reshaping with obs[np.newaxis,:], but then the network complains that it expects min_ndim = 2 but got ndim = 1. I also tried converting obs to a tensor before passing it to model(), but that gives the same TypeError as above.


Solution

  • env.reset() returns two values, the observation and an info dict, so obs in the question is a tuple rather than an array. Unpack the result and pass only the observation to the model:

    obs, _ = env.reset()
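
To see why the message mentions NoneType: np.newaxis is an alias for None, so obs[np.newaxis] on a tuple is really obs[None]. The sketch below reproduces the error without Gym or TensorFlow. fake_reset and NEWAXIS are hypothetical stand-ins, and it assumes a Gym version (0.26 or later, or Gymnasium) in which reset() returns an (observation, info) pair, which matches the five-value env.step() in the question.

```python
# Hypothetical stand-in for env.reset(); it only mimics the
# (observation, info) return shape assumed above.
def fake_reset():
    observation = [0.01, -0.02, 0.03, 0.04]  # CartPole observations have 4 values
    info = {}
    return observation, info

NEWAXIS = None  # np.newaxis is an alias for None

# Buggy pattern from the question: the whole tuple is stored in obs.
obs = fake_reset()
try:
    obs[NEWAXIS]  # same as obs[np.newaxis] on a tuple
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Fixed pattern: unpack the pair and keep only the observation.
obs, info = fake_reset()
print(type(obs).__name__, len(obs))
```

The traceback shows a second obs = env.reset() inside play_episodes, so that call needs the same unpacking as the one at the top of the script.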