I am building an environment in Gymnasium, the maintained fork of gym by Farama. In my environment, I set action_space = gym.spaces.Discrete(5) and observation_space = gym.spaces.MultiBinary(25). Running the environment with the agent-environment loop suggested on the Gym Basic Usage website works with no problems: I registered the environment and it can simply be created with gym.make().
However, I now want to train a reinforcement learning agent on this environment. I have come across Stable Baselines3, which makes a DQN agent implementation fairly easy. However, it does not seem to support the new Gymnasium. Namely:
import gymnasium as gym
from stable_baselines3.dqn.policies import MlpPolicy
from stable_baselines3 import DQN
env = gym.make("myEnv")
model = DQN(MlpPolicy, env, verbose=1)
Yes, I know that "myEnv" is not reproducible, but the environment itself (along with the structure of the file system) is too large to post, and it is not the point of this question.
This code produces an error:
AssertionError: The algorithm only supports (<class 'gym.spaces.discrete.Discrete'>,) as action spaces but Discrete(5) was provided
My question is the following: does Stable Baselines3 support Gymnasium?
I have tried to instead use gym.spaces in order to define the action_space and observation_space, such that
from gym.spaces import Discrete, MultiBinary
action_space = Discrete(5)
observation_space = MultiBinary(25)
but with this change, I have to rewrite a large portion of the environment to support the old gym package. I wonder whether there is a better solution than that.
I was a bit confused by the other answer here as I was sure I'd seen gymnasium in the Stable Baselines 3 docs somewhere. Sure enough, it's even in the most basic "getting started" example: https://stable-baselines3.readthedocs.io/en/master/guide/quickstart.html.
However, I just tried running exactly this code and received the same error as the OP. Replacing gymnasium with gym 0.21 in the same example works without a problem.
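My reading of why the error says Discrete(5) is not a Discrete: the older release seems to check the action space with isinstance against the classes from the gym package, and the class of the same name in gymnasium is a different class. Below is a minimal sketch of that situation. OldDiscrete, NewDiscrete and check_action_space are hypothetical stand-ins written for illustration, not the actual code of either library.

```python
class OldDiscrete:
    """Stand-in for gym.spaces.Discrete (hypothetical, for illustration)."""

    def __init__(self, n):
        self.n = n

    def __repr__(self):
        return f"Discrete({self.n})"


class NewDiscrete:
    """Stand-in for gymnasium.spaces.Discrete: same repr, different class."""

    def __init__(self, n):
        self.n = n

    def __repr__(self):
        return f"Discrete({self.n})"


def check_action_space(space, supported=(OldDiscrete,)):
    # Assumed shape of the check: an isinstance test against a tuple of classes.
    assert isinstance(space, supported), (
        f"The algorithm only supports {supported} as action spaces "
        f"but {space} was provided"
    )


check_action_space(OldDiscrete(5))  # passes: the class matches

try:
    check_action_space(NewDiscrete(5))  # prints as Discrete(5) but is rejected
except AssertionError as error:
    print(error)
```

Both objects print as Discrete(5), so the message looks contradictory even though the check is only comparing classes.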
Edit: The "getting started" example using gymnasium works with stable_baselines3 version 2.0.0a1 and above.
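To find out which side of that version boundary an installation is on, something like the following could be used. It is only a sketch using the standard library: the helper names are made up for illustration, the parser handles just the simple version shapes such as 1.8.0 or 2.0.0a1, and I am assuming the distribution is registered under the name "stable-baselines3".

```python
import re
from importlib import metadata

# First release reported above to work with gymnasium: 2.0.0a1.
MIN_GYMNASIUM_RELEASE = (2, 0, 0, "a", 1)


def parse_version(text):
    """Turn '2.0.0a1' or '1.8.0' into a tuple that compares in release order."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch = (int(part) for part in match.group(1, 2, 3))
    # Final releases get stage "z" so they sort after a, b and rc pre-releases.
    stage = match.group(4) or "z"
    number = int(match.group(5) or 0)
    return (major, minor, patch, stage, number)


def supports_gymnasium(version_text):
    """True if the given version is 2.0.0a1 or newer."""
    return parse_version(version_text) >= MIN_GYMNASIUM_RELEASE


def installed_sb3_version():
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version("stable-baselines3")  # assumed distribution name
    except metadata.PackageNotFoundError:
        return None
```

If installed_sb3_version() returns a version for which supports_gymnasium() is False, upgrading stable_baselines3 is probably less work than porting the environment back to the old gym package.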