
What is the action_space for?


I'm making a custom environment in OpenAI Gym and I really don't understand what action_space is for, or what I should put in it. To be precise, I don't know what action_space is at all; I've never used it in any code, and I couldn't find anything online that properly answers my question.


Solution

  • The action_space attribute of a gym environment describes the environment's action space: whether the actions are continuous or discrete, what the minimum and maximum values of the actions are, and so on.

    For a continuous action space one can use the Box class.

    import gym
    import numpy as np
    from gym import spaces

    class MyEnv(gym.Env):
        def __init__(self):
            # 2-dimensional continuous action space:
            # [-1, 2] for the first dimension, [-2, 4] for the second
            self.action_space = spaces.Box(low=np.array([-1, -2]),
                                           high=np.array([2, 4]),
                                           dtype=np.float32)
    

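A Box space can also be exercised on its own, which is handy when debugging a custom environment. Below is a small standalone sketch (using the same bounds as above) showing `sample()` and `contains()`, both standard methods on gym spaces.

```python
import numpy as np
from gym import spaces

# Same bounds as above: dim 0 in [-1, 2], dim 1 in [-2, 4]
box = spaces.Box(low=np.array([-1, -2]),
                 high=np.array([2, 4]),
                 dtype=np.float32)

action = box.sample()        # a random action drawn inside the bounds
print(box.shape)             # (2,)
print(box.contains(action))  # True: sampled actions are always valid
```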
    For a discrete action space one can use the Discrete class.

    import gym
    from gym import spaces

    class MyEnv(gym.Env):
        def __init__(self):
            # discrete action space with two possible actions: {0, 1}
            self.action_space = spaces.Discrete(2)
    

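Discrete spaces support the same `sample()`/`contains()` helpers; a quick standalone check (not from the original answer):

```python
from gym import spaces

disc = spaces.Discrete(2)   # valid actions are the integers {0, 1}

print(disc.n)               # 2: the number of available actions
a = disc.sample()           # randomly returns 0 or 1
print(disc.contains(a))     # True
print(disc.contains(5))     # False: 5 is outside {0, 1}
```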
    If you have any other requirements, you can go through the spaces module in the OpenAI gym repo. You could also go through the different environments in the gym folder to see more examples of how action_space and observation_space are used.

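Besides Box and Discrete, gym.spaces also provides composite spaces such as MultiDiscrete, MultiBinary, Tuple, and Dict. A brief illustrative sketch (the gamepad framing is hypothetical):

```python
import numpy as np
from gym import spaces

# MultiDiscrete: several independent discrete sub-actions, e.g. a
# hypothetical gamepad with a 3-position axis and a 2-state button
pad = spaces.MultiDiscrete([3, 2])

# Dict: named sub-spaces, useful for structured actions or observations
composite = spaces.Dict({
    "button": spaces.Discrete(4),
    "stick": spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32),
})

print(pad.sample())        # a length-2 array of valid sub-actions
print(composite.sample())  # a dict with one sample per sub-space
```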
    Also, go through core.py to see which methods and attributes an environment needs in order to be compatible with gym.

        The main OpenAI Gym class. It encapsulates an environment with
        arbitrary behind-the-scenes dynamics. An environment can be
        partially or fully observed.
        The main API methods that users of this class need to know are:
            step
            reset
            render
            close
            seed
        And set the following attributes:
            action_space: The Space object corresponding to valid actions
            observation_space: The Space object corresponding to valid observations
            reward_range: A tuple corresponding to the min and max possible rewards
        Note: a default reward range set to [-inf,+inf] already exists. Set it if you want a narrower range.
        The methods are accessed publicly as "step", "reset", etc.. The
        non-underscored versions are wrapper methods to which we may add
        functionality over time.