I'm trying to create an environment for my reinforcement learning algorithm, but there seems to be a problem when calling PPOPolicy. I developed the following environment, envFru:
import gym
import os, sys
import numpy as np
import pandas as pd
from gym import spaces
import random

class envFru(gym.Env):
    metadata = {'render.modes': ['human']}

    def __init__(self, df=None):
        self.df = df
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(low=np.array([0, 0, 0]), high=np.array([1, 1, 1]), dtype=np.float16)

    def reset(self):
        pass

    def step(self, action):
        pass

    def _next_observation(self):
        pass

    def _take_action(self, action):
        pass

    def render(self, mode='human', close=False):
        pass
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines.common.policies import MlpPolicy
from stable_baselines2.ppo.ppo import PPO

envF = DummyVecEnv([lambda: envFru()])
model = PPOPolicy(envF, MlpPolicy, learning_rate=0.001)
model.learn(total_timesteps=20000)

obs = envF.reset()
for i in range(MAX_EPISODES):
    action, _states = model.predict(obs)
    obs, reward, done, info = envF.step(action)
    # envF.render()
The traceback I'm getting is the following:
AttributeError Traceback (most recent call last)
<ipython-input-124-550b8c75c26b> in <module>
12 envF = DummyVecEnv([lambda : envFruit()])
13
---> 14 model = PPOPolicy(envF, MlpPolicy, learning_rate= 0.001)
15 model.learn(total_timesteps=20000)
16
~\Desktop\ImitationLearning\stable_baselines2\ppo\policies.py in __init__(self, observation_space, action_space, learning_rate, net_arch, activation_fn, adam_epsilon, ortho_init, log_std_init)
29 ortho_init=True, log_std_init=0.0):
30 super(PPOPolicy, self).__init__(observation_space, action_space)
---> 31 self.obs_dim = self.observation_space.shape[0]
32
33 # Default network architecture, from stable-baselines
AttributeError: 'DummyVecEnv' object has no attribute 'shape'
Are you sure this is your actual code? In the snippet above, the name PPOPolicy is not even defined; we would need to see the code of PPOPolicy. Evidently its constructor (its __init__ method) expects something with a shape attribute as its first argument, perhaps a pandas DataFrame or, more likely given the parameter name in the traceback, a gym space. Your envF does not have a shape attribute, and that leads to the error.
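The mechanism can be reproduced with plain-Python stand-ins. Note that Box, ToyVecEnv, and ToyPolicy below are hypothetical simplifications for illustration, not the real gym or stable-baselines classes:

```python
from collections import namedtuple

# Hypothetical stand-ins, not the real gym/stable-baselines classes.
Box = namedtuple("Box", ["shape"])        # a space: has a .shape attribute

class ToyVecEnv:                          # a wrapper like DummyVecEnv: has no .shape
    def __init__(self, observation_space):
        self.observation_space = observation_space

class ToyPolicy:                          # mimics the failing __init__
    def __init__(self, observation_space):
        self.obs_dim = observation_space.shape[0]

env = ToyVecEnv(Box(shape=(3,)))

try:
    ToyPolicy(env)                        # wrapper passed where a space is expected
except AttributeError as exc:
    print(exc)                            # 'ToyVecEnv' object has no attribute 'shape'

print(ToyPolicy(env.observation_space).obs_dim)  # 3: passing the space itself works
```

The same attribute lookup that fails on the wrapper succeeds on the space stored inside it, which is exactly what the traceback shows.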
Just judging from the names in your snippet, I guess the relevant line should read
model = PPOPolicy(
    envF.observation_space,
    envF.action_space,
    MlpPolicy,
    learning_rate=0.001
)
(Note that if you intended to use the released stable-baselines package rather than your local stable_baselines2 fork, its model classes instead take the policy first and the environment second.)
My assumption stems from the line in the traceback
super(PPOPolicy, self).__init__(observation_space, action_space)
which tells us that the constructor of PPOPolicy passes two variables named observation_space and action_space to its super() constructor. Since these names also appear on your environment, I guess this is the problem here. But as long as we don't see the correct and full code, this is just navigating through the fog.
It may also help you to learn how to read an error message; that should serve you well with future problems. I suggest you read something like https://www.tutorialsteacher.com/python/error-types-in-python