deep-learning reinforcement-learning openai-gym

gym.spaces.Box Observation State Understanding


I'm trying to do some reinforcement learning in a custom environment using gym, but I'm very confused about how spaces.Box works. What does each of the parameters mean? If I have a game state that involves lots of information, such as the HP of characters, their stats and their abilities, I'm not sure how something like this would be represented in a Box as an observation state. Also, in a game with a lot of abilities, would it be better to one-hot encode them or leave them as regular incremental IDs, given that I want to use a neural network to find expected Q values?


Solution

  • spaces.Box means that you are dealing with real-valued quantities.

    For example:

    action_space = spaces.Box(low=np.array([-1, 0, 1]), high=np.array([1, 1, 2]), dtype=np.float32)

    Here the actions are 3-dimensional. The first array, [-1,0,1], gives the lowest accepted value for each dimension, and the second array, [1,1,2], gives the highest accepted value for each dimension.

    In essence, an action is a vector a = [a1, a2, a3], where a1 is in the range [-1, 1], a2 is in the range [0, 1], and a3 is in the range [1, 2].

    If there are many distinct "abilities", one-hot encoding them can make the state vector very large. In that case it may be preferable to keep regular incremental IDs, but normalise them to the range [0, 1] so that the neural network's activations don't saturate.
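To make this concrete, here is a minimal sketch of how the action space above and a game-state observation could be declared. The state layout (hp, attack, defense, ability_id), the constants MAX_HP, MAX_STAT and NUM_ABILITIES, and the helper encode_state are all made up for illustration and are not part of gym. The import falls back from gymnasium (the maintained successor of gym) to gym, assuming one of the two is installed.

```python
import numpy as np

try:
    from gymnasium import spaces  # maintained successor of gym
except ImportError:
    from gym import spaces

# The action space from the answer: one lower and one upper bound per dimension.
action_space = spaces.Box(
    low=np.array([-1.0, 0.0, 1.0], dtype=np.float32),
    high=np.array([1.0, 1.0, 2.0], dtype=np.float32),
    dtype=np.float32,
)

# Hypothetical game constants, assumed only for this example.
MAX_HP = 100.0
MAX_STAT = 50.0
NUM_ABILITIES = 20  # ability IDs run from 0 to NUM_ABILITIES - 1

# Observation: [hp, attack, defense, ability_id], every entry scaled to [0, 1].
# Passing scalar bounds together with a shape applies them to every dimension.
observation_space = spaces.Box(low=0.0, high=1.0, shape=(4,), dtype=np.float32)


def encode_state(hp, attack, defense, ability_id):
    """Scale raw game values into the [0, 1] range declared by observation_space."""
    return np.array(
        [
            hp / MAX_HP,
            attack / MAX_STAT,
            defense / MAX_STAT,
            ability_id / (NUM_ABILITIES - 1),  # incremental ID normalised to [0, 1]
        ],
        dtype=np.float32,
    )


obs = encode_state(hp=75, attack=20, defense=35, ability_id=7)
sample_action = action_space.sample()  # random action within the declared bounds

print(observation_space.contains(obs))
print(action_space.contains(sample_action))
```

An environment would assign these to self.observation_space and self.action_space and return arrays like obs from reset() and step(). Calling contains() on what you return is a cheap way to catch values that fall outside the declared bounds.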