Tags: reinforcement-learning, openai-gym

How can I register a custom environment in OpenAI's gym?


I have created a custom environment following the OpenAI Gym framework, containing step, reset, action, and reward functions. I aim to run OpenAI Baselines on this custom environment, but before that, the environment has to be registered with OpenAI Gym. How can I register the custom environment with OpenAI Gym? Also, should I modify the OpenAI Baselines code to incorporate this?


Solution

  • You do not need to modify the baselines repo.

    Here is a minimal example. Say you have myenv.py with all the needed methods (step, reset, ...). The environment class is named MyEnv, and you want to add it to the classic_control folder. You have to:

    • Place myenv.py file in gym/gym/envs/classic_control
    • Add to __init__.py (located in the same folder)

      from gym.envs.classic_control.myenv import MyEnv

    • Register the environment in gym/gym/envs/__init__.py by adding

      gym.envs.register(
          id='MyEnv-v0',
          entry_point='gym.envs.classic_control:MyEnv',
          max_episode_steps=1000,
      )
      

    At registration, you can also add reward_threshold and kwargs (if your class takes some arguments).
    You can also directly register the environment in the script you will run (TRPO, PPO, or whatever) instead of doing it in gym/gym/envs/__init__.py.
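    As a sketch of that in-script route, the snippet below registers and builds a toy environment without touching the gym source tree. The MyEnv class, its size argument, and the id 'MyEnvLocal-v0' are hypothetical stand-ins for your own environment; note that entry_point also accepts the class object itself, not just a 'module:Class' string.

    ```python
    import gym
    from gym import spaces
    import numpy as np

    # Hypothetical stand-in for your own environment class
    class MyEnv(gym.Env):
        def __init__(self, size=1):
            self.size = size
            self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(size,), dtype=np.float32)
            self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(size,), dtype=np.float32)
            self.state = np.zeros(size, dtype=np.float32)

        def step(self, action):
            self.state = np.clip(self.state + action, -1.0, 1.0)
            return self.state, 0.0, False, {}

        def reset(self):
            self.state = np.zeros(self.size, dtype=np.float32)
            return self.state

    # Register in the script itself; no edit to gym/gym/envs/__init__.py needed
    gym.envs.register(
        id='MyEnvLocal-v0',
        entry_point=MyEnv,            # a callable works as well as a module path
        max_episode_steps=1000,
        reward_threshold=0.0,         # optional
        kwargs={'size': 1},           # optional, forwarded to MyEnv.__init__
    )
    env = gym.make('MyEnvLocal-v0')
    ```

    Registering this way keeps the environment local to your training script, which is usually what you want when running baselines on it.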

    EDIT

    This is a minimal example to create the LQR environment.

    Save the code below in lqr_env.py and place it in the classic_control folder of gym.

    import gym
    from gym import spaces
    from gym.utils import seeding
    import numpy as np
    
    class LqrEnv(gym.Env):
    
        def __init__(self, size, init_state, state_bound):
            self.init_state = init_state
            self.size = size
            # States and actions share the same box [-state_bound, state_bound]^size
            self.action_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
            self.observation_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
            self._seed()
    
        # The underscore-prefixed methods (_seed, _step, _reset) are the old Gym
        # API; in Gym 0.9.6 and later, name them seed, step, and reset instead.
        def _seed(self, seed=None):
            self.np_random, seed = seeding.np_random(seed)
            return [seed]
    
        def _step(self, u):
            # Quadratic (LQR) cost on state and control; the reward is its negative
            costs = np.sum(u**2) + np.sum(self.state**2)
            self.state = np.clip(self.state + u, self.observation_space.low, self.observation_space.high)
            return self._get_obs(), -costs, False, {}
    
        def _reset(self):
            # Start uniformly in [-init_state, init_state]^size
            high = self.init_state * np.ones((self.size,))
            self.state = self.np_random.uniform(low=-high, high=high)
            return self._get_obs()
    
        def _get_obs(self):
            return self.state
    

    Add from gym.envs.classic_control.lqr_env import LqrEnv to __init__.py (also in classic_control).

    In your script, when you create the environment, do

    gym.envs.register(
        id='Lqr-v0',
        entry_point='gym.envs.classic_control:LqrEnv',
        max_episode_steps=150,
        kwargs={'size': 1, 'init_state': 10., 'state_bound': np.inf},
    )
    env = gym.make('Lqr-v0')
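
    Once made, the environment can be driven like any other Gym env. The rollout sketch below is self-contained: it registers a hypothetical stand-in (id 'LqrDemo-v0') with the same dynamics as LqrEnv, defined inline and without the old leading underscores, so it runs without placing a file in the gym tree.

    ```python
    import gym
    from gym import spaces
    import numpy as np

    # Hypothetical stand-in with the same dynamics as the LqrEnv above,
    # defined inline so this sketch runs on its own
    class LqrDemoEnv(gym.Env):
        def __init__(self, size, init_state, state_bound):
            self.init_state = init_state
            self.size = size
            self.action_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))
            self.observation_space = spaces.Box(low=-state_bound, high=state_bound, shape=(size,))

        def step(self, u):
            costs = np.sum(u**2) + np.sum(self.state**2)
            self.state = np.clip(self.state + u, self.observation_space.low, self.observation_space.high)
            return self.state, -costs, False, {}

        def reset(self):
            high = self.init_state * np.ones((self.size,))
            self.state = np.random.uniform(low=-high, high=high)
            return self.state

    gym.envs.register(
        id='LqrDemo-v0',
        entry_point=LqrDemoEnv,
        max_episode_steps=150,
        kwargs={'size': 1, 'init_state': 10., 'state_bound': np.inf},
    )

    env = gym.make('LqrDemo-v0')
    obs = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        u = env.action_space.sample()          # random control input
        obs, reward, done, info = env.step(u)  # TimeLimit sets done after 150 steps
        total_reward += reward
    ```

    Since the cost is a sum of squares, every reward is nonpositive, and total_reward is an (inverted) measure of how badly the random policy regulates the state.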