python, deep-learning, reinforcement-learning, openai-gym

Observation with different boundaries. The observation returned by the `reset()` method does not match the given observation space


I'm a beginner in reinforcement learning, so don't judge me harshly.

error: AssertionError: The observation returned by the reset() method does not match the given observation space

observation_space:

self.observation_space = gym.spaces.Tuple((
    gym.spaces.Box(low=-float('inf'), high=self.fp.HEIGHT, shape=(1,), dtype=np.float64),  # player y
    gym.spaces.Box(low=0, high=self.fp.WIDTH + self.fp.MIN_PIPE_GAP + self.fp.PIPE_WIDTH, shape=(2,), dtype=np.float64),  # pipes x
    gym.spaces.Box(low=-float('inf'), high=float('inf'), shape=(1,), dtype=np.float64),  # gravity
    gym.spaces.Box(low=-(self.fp.HEIGHT / 4 * 3 + self.fp.MIN_PIPE_GAP + 100), high=self.fp.HEIGHT / 4 * 3 + self.fp.MIN_PIPE_GAP + 100, shape=(4,), dtype=np.float64),  # pipes y
    gym.spaces.Box(low=self.fp.PX, high=self.fp.PX, shape=(1,), dtype=np.float64),  # player x
))

returned observation:

return (
    np.array([float(self.py)]),  # py
    np.array([float(self.pipes[ind]['x']), float(self.pipes[ind + 1]['x'])]),  # x1 x2
    np.array([float(self.gravity)]),  # gravity
    np.array([float(self.pipes[ind]['y1']), float(self.pipes[ind]['y2']), float(self.pipes[ind + 1]['y1']), float(self.pipes[ind + 1]['y2'])]),  # y1 y2 y3 y4
    np.array([float(self.PX)]),  # px
)

I tried putting everything into a single array, and that worked, but it isn't what I want, because the different groups of values need different bounds. Most likely the problem is the format of the returned observation. If the format looks correct to you, I'll look for the error in the bounds instead.


Solution

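  • The error turned out to be in the bounds: the only bound that differs from the question is the upper limit of pipes_x, which is now self.fp.WIDTH * 3. In the end the checker also advised using a Dict space, so I rewrote the code like this: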

    observation_space:

    self.observation_space = gym.spaces.Dict({
        "player_y": gym.spaces.Box(low=-float('inf'), high=self.fp.HEIGHT, shape=(1,), dtype=np.float64),  # player y
        "pipes_x": gym.spaces.Box(low=0, high=self.fp.WIDTH * 3, shape=(2,), dtype=np.float64),  # pipes x
        "gravity": gym.spaces.Box(low=-float('inf'), high=float('inf'), shape=(1,), dtype=np.float64),  # gravity
        "pipes_y": gym.spaces.Box(low=-(self.fp.HEIGHT / 4 * 3 + self.fp.MIN_PIPE_GAP + 100), high=self.fp.HEIGHT / 4 * 3 + self.fp.MIN_PIPE_GAP + 100, shape=(4,), dtype=np.float64),  # pipes y
        "player_x": gym.spaces.Box(low=self.fp.PX, high=self.fp.PX, shape=(1,), dtype=np.float64),  # player x
    })
    

    return:

    return {
        "player_y": np.array([float(self.py)]),  # py
        "pipes_x": np.array([float(self.pipes[ind]['x']), float(self.pipes[ind + 1]['x'])]),  # x1 x2
        "gravity": np.array([float(self.gravity)]),  # gravity
        "pipes_y": np.array([float(self.pipes[ind]['y1']), float(self.pipes[ind]['y2']), float(self.pipes[ind + 1]['y1']), float(self.pipes[ind + 1]['y2'])]),  # y1 y2 y3 y4
        "player_x": np.array([float(self.PX)]),  # px
    }
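
The assertion only says that the observation does not match; it does not say which value is at fault. One way to narrow it down is to compare every value against its bounds by hand. Below is a minimal, standard-library-only sketch of that idea. The constants, the `BOUNDS` table and the `find_violations` helper are all hypothetical stand-ins (the real values live on `self.fp` and are not shown in the post), and the sample observation is made up to illustrate a pipe spawning to the right of the original `pipes_x` limit.

```python
import math

# Hypothetical game constants; the real ones are attributes of self.fp.
HEIGHT, WIDTH, MIN_PIPE_GAP, PIPE_WIDTH, PX = 512, 288, 100, 52, 50

PIPES_Y_LIMIT = HEIGHT / 4 * 3 + MIN_PIPE_GAP + 100

# key -> (low, high, expected number of values), following the question's bounds
BOUNDS = {
    "player_y": (-math.inf, HEIGHT, 1),
    "pipes_x": (0, WIDTH + MIN_PIPE_GAP + PIPE_WIDTH, 2),
    "gravity": (-math.inf, math.inf, 1),
    "pipes_y": (-PIPES_Y_LIMIT, PIPES_Y_LIMIT, 4),
    "player_x": (PX, PX, 1),
}


def find_violations(bounds, obs):
    """Return one message per value that has the wrong count or breaks its bounds."""
    problems = []
    for key, (low, high, length) in bounds.items():
        # Works for plain lists and for 1-D numpy arrays alike.
        values = [float(v) for v in obs[key]]
        if len(values) != length:
            problems.append(f"{key}: expected {length} values, got {len(values)}")
        for i, v in enumerate(values):
            # Written as "not (low <= v <= high)" so that NaN is reported too.
            if not (low <= v <= high):
                problems.append(f"{key}[{i}] = {v} is outside [{low}, {high}]")
    return problems


# Made-up reset() observation: the second pipe sits at x = 500,
# beyond the original upper bound of 440 for pipes_x.
obs = {
    "player_y": [256.0],
    "pipes_x": [300.0, 500.0],
    "gravity": [0.0],
    "pipes_y": [-200.0, 150.0, -120.0, 230.0],
    "player_x": [50.0],
}

for message in find_violations(BOUNDS, obs):
    print(message)  # pipes_x[1] = 500.0 is outside [0, 440]
```

With the real spaces, a quicker yes/no per entry is available by calling `contains` on each sub-space, for example `self.observation_space.spaces[key].contains(obs[key])` for a Dict space. The manual comparison above is only useful for seeing the offending value next to its limits.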