Too many values in Observation space: Box


I am working through the OpenAI Gym tutorial and am stuck right at the start. After creating the "MountainCar-v0" environment, I expect the observation space to have 2 values, but I appear to be getting 6.

`import gym
env = gym.make('MountainCar-v0')
obs_space = env.observation_space
action_space = env.action_space
print("The observation space: {}".format(obs_space))
print("The action space: {}".format(action_space))`

While I am expecting to get:

OUTPUT:
The observation space: Box(2,)
The action space: Discrete(3)

I am getting:

The observation space: Box([-1.2  -0.07], [0.6  0.07], (2,), float32)
The action space: Discrete(3)

This is causing an error downstream when trying to generate new obs:

obs = env.reset()
random_action = env.action_space.sample()
new_obs, reward, done, info = env.step(random_action)
print("The new observation is {}".format(new_obs))

I get the following error:

ValueError                                Traceback (most recent call last)
Input In [4], in <cell line: 11>()
      8 random_action = env.action_space.sample()
     10 # # Take the action and get the new observation space
---> 11 new_obs, reward, done, info = env.step(random_action)
     12 print("The new observation is {}".format(new_obs))

ValueError: too many values to unpack (expected 4)

Solution

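  • I am assuming that you are using gym 0.26.2, the latest version at the time of writing. The observation space itself is fine: Box([-1.2  -0.07], [0.6  0.07], (2,), float32) is the newer printout of the same 2-dimensional space, listing the lower bounds, the upper bounds, the shape, and the dtype.

    The ValueError comes from env.step(), which since gym 0.26 returns five values instead of four: done has been split into terminated and truncated. Based on the gym documentation, your error can be resolved by unpacking all five: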

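    import gym

    env = gym.make('MountainCar-v0')
    obs_space = env.observation_space
    action_space = env.action_space

    # Since gym 0.26, reset() returns (observation, info), not just the observation
    obs, info = env.reset()
    random_action = env.action_space.sample()
    # step() returns five values: done is split into terminated and truncated
    observation, reward, terminated, truncated, info = env.step(random_action)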
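
If the same tutorial code has to run on both older and newer gym versions, one option is to normalize whatever env.step() returns before unpacking it. The sketch below is only an illustration: unpack_step is a hypothetical helper name, not part of gym, and the sample tuples are stand-ins shaped like step() results so the snippet runs without gym installed.

```python
def unpack_step(result):
    """Normalize a step() result to (obs, reward, done, info).

    Hypothetical helper, not part of gym. Accepts either the older
    4-tuple (obs, reward, done, info) or the newer 5-tuple
    (obs, reward, terminated, truncated, info).
    """
    if len(result) == 5:
        obs, reward, terminated, truncated, info = result
        # The episode is over if it ended naturally or was cut off.
        return obs, reward, terminated or truncated, info
    if len(result) == 4:
        return tuple(result)
    raise ValueError(f"unexpected step() result of length {len(result)}")


# Stand-in tuples shaped like step() results (assumed values, for illustration).
old_style = ([-0.5, 0.0], -1.0, False, {})
new_style = ([-0.5, 0.0], -1.0, False, True, {})

print(unpack_step(old_style))  # ([-0.5, 0.0], -1.0, False, {})
print(unpack_step(new_style))  # ([-0.5, 0.0], -1.0, True, {})
```

With a real environment you would call it as new_obs, reward, done, info = unpack_step(env.step(random_action)). Folding terminated and truncated into one flag loses the distinction between a natural ending and a time limit, so this is mainly useful for getting older tutorial code to run.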