I can't find an exact description of the differences between the OpenAI Gym environments 'CartPole-v0' and 'CartPole-v1'.
Both environments have separate official websites dedicated to them (see 1 and 2), though I can only find one piece of code, without version identification, in the gym GitHub repository (see 3). I also checked in the debugger which files are actually loaded, and both seem to load the same aforementioned file. The only difference seems to be in their internally assigned max_episode_steps and reward_threshold, which can be accessed as seen below. CartPole-v0 has the values 200/195.0 and CartPole-v1 has the values 500/475.0. The rest seems identical at first glance.
import gym
env = gym.make("CartPole-v1")
print(env.spec.max_episode_steps)
print(env.spec.reward_threshold)
I would therefore appreciate it if someone could describe the exact differences for me or point me to a website that does so. Thank you very much!
As you probably have noticed, in OpenAI Gym sometimes there are different versions of the same environments. The different versions usually share the main environment logic but some parameters are configured with different values. These versions are managed using a feature called the registry.
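As a rough illustration of the idea (this is not Gym's actual implementation), a registry can be pictured as a dictionary that maps an ID string to a spec holding the entry point and a few parameters. The names `ToySpec`, `toy_register`, and `TOY_REGISTRY` below are hypothetical and exist only for this sketch; the values are the ones quoted in the question.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical stand-in for an environment spec (not Gym's own class)."""
    id: str
    entry_point: str
    max_episode_steps: Optional[int] = None
    reward_threshold: Optional[float] = None


# Hypothetical registry: maps an ID string to its spec.
TOY_REGISTRY: Dict[str, ToySpec] = {}


def toy_register(id: str, entry_point: str, **kwargs) -> None:
    """Store a spec under its ID; several IDs may share one entry point."""
    TOY_REGISTRY[id] = ToySpec(id=id, entry_point=entry_point, **kwargs)


toy_register("CartPole-v0", "gym.envs.classic_control:CartPoleEnv",
             max_episode_steps=200, reward_threshold=195.0)
toy_register("CartPole-v1", "gym.envs.classic_control:CartPoleEnv",
             max_episode_steps=500, reward_threshold=475.0)

v0, v1 = TOY_REGISTRY["CartPole-v0"], TOY_REGISTRY["CartPole-v1"]
# Both IDs point at the same class; only the parameters differ.
same_class = v0.entry_point == v1.entry_point
print(same_class, v0.max_episode_steps, v1.max_episode_steps)
```

The point of the sketch is that two IDs can resolve to the same environment class while carrying different per-version settings.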
In the case of the CartPole environment, you can find the two registered versions in this source code. As you can see in lines 50 to 65, there exist two CartPole versions, tagged as v0 and v1, whose differences are the parameters max_episode_steps and reward_threshold:
register(
    id='CartPole-v0',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=200,
    reward_threshold=195.0,
)
register(
    id='CartPole-v1',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=500,
    reward_threshold=475.0,
)
These two parameters are the only difference between the two registrations, which confirms your guess about CartPole-v0 and CartPole-v1.
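To see what the step cap means in practice, here is a minimal sketch that does not use Gym at all. It assumes a reward of +1 per step (as CartPole gives) and a stand-in policy that never lets the pole fall; `always_alive_step` and `run_capped_episode` are hypothetical names made up for this example.

```python
from typing import Callable, Tuple


def always_alive_step() -> Tuple[float, bool]:
    """Hypothetical perfect step: +1 reward, and the pole never falls."""
    return 1.0, False


def run_capped_episode(step: Callable[[], Tuple[float, bool]],
                       max_episode_steps: int) -> float:
    """Run until the step function reports done or the step cap is reached."""
    total = 0.0
    for _ in range(max_episode_steps):
        reward, done = step()
        total += reward
        if done:
            break
    return total


# Same step logic, different caps, as with the two registered versions.
return_v0 = run_capped_episode(always_alive_step, max_episode_steps=200)
return_v1 = run_capped_episode(always_alive_step, max_episode_steps=500)
print(return_v0, return_v1)
```

Under these assumptions the best possible return is 200 for v0 and 500 for v1, so the reward_threshold values of 195.0 and 475.0 sit just below the respective caps.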