
What is the action_space for?


I'm making a custom environment in OpenAI Gym and I really don't understand what action_space is for, or what I should put in it. To be precise, I don't know what action_space is at all; I've never used it in any code, and I couldn't find anything online that properly answers my question.


Solution

  • The action_space attribute of a gym environment describes the environment's action space: whether the actions are continuous or discrete, what the minimum and maximum values of the actions are, and so on.

    For a continuous action space one can use the Box class.

    import gym
    import numpy as np
    from gym import spaces

    class MyEnv(gym.Env):
        def __init__(self):
            # 2-dimensional continuous action space:
            # [-1, 2] for the first dimension, [-2, 4] for the second
            self.action_space = spaces.Box(low=np.array([-1, -2]),
                                           high=np.array([2, 4]),
                                           dtype=np.float32)
    

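A Box space can also be exercised on its own, which is handy when debugging a custom environment. Below is a small standalone sketch (using the same bounds as above) showing `sample()` and `contains()`, both standard methods on gym spaces.

```python
import numpy as np
from gym import spaces

# Same bounds as above: dim 0 in [-1, 2], dim 1 in [-2, 4]
box = spaces.Box(low=np.array([-1, -2]),
                 high=np.array([2, 4]),
                 dtype=np.float32)

action = box.sample()        # a random action drawn inside the bounds
print(box.shape)             # (2,)
print(box.contains(action))  # True: sampled actions are always valid
```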
    For a discrete action space one can use the Discrete class.

    import gym
    from gym import spaces

    class MyEnv(gym.Env):
        def __init__(self):
            # discrete action space with two possible actions: {0, 1}
            self.action_space = spaces.Discrete(2)
    

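Discrete spaces support the same `sample()`/`contains()` helpers; a quick standalone check (not from the original answer):

```python
from gym import spaces

disc = spaces.Discrete(2)   # valid actions are the integers {0, 1}

print(disc.n)               # 2: the number of available actions
a = disc.sample()           # randomly returns 0 or 1
print(disc.contains(a))     # True
print(disc.contains(5))     # False: 5 is outside {0, 1}
```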
    If you have any other requirements, you can go through the spaces module in the OpenAI gym repo. You could also go through the different environments in the gym folder to see more examples of how action_space and observation_space are used.

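Besides Box and Discrete, gym.spaces also provides composite spaces such as MultiDiscrete, MultiBinary, Tuple, and Dict. A brief illustrative sketch (the gamepad framing is hypothetical):

```python
import numpy as np
from gym import spaces

# MultiDiscrete: several independent discrete sub-actions, e.g. a
# hypothetical gamepad with a 3-position axis and a 2-state button
pad = spaces.MultiDiscrete([3, 2])

# Dict: named sub-spaces, useful for structured actions or observations
composite = spaces.Dict({
    "button": spaces.Discrete(4),
    "stick": spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32),
})

print(pad.sample())        # a length-2 array of valid sub-actions
print(composite.sample())  # a dict with one sample per sub-space
```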
    Also, go through core.py to see which methods and attributes an environment needs in order to be compatible with gym.

        The main OpenAI Gym class. It encapsulates an environment with
        arbitrary behind-the-scenes dynamics. An environment can be
        partially or fully observed.
        The main API methods that users of this class need to know are:
            step
            reset
            render
            close
            seed
        And set the following attributes:
            action_space: The Space object corresponding to valid actions
            observation_space: The Space object corresponding to valid observations
            reward_range: A tuple corresponding to the min and max possible rewards
        Note: a default reward range set to [-inf,+inf] already exists. Set it if you want a narrower range.
        The methods are accessed publicly as "step", "reset", etc.. The
        non-underscored versions are wrapper methods to which we may add
        functionality over time.