I'm trying to create an environment for my reinforcement learning algorithm, but there seems to be a problem when calling PPOPolicy. I developed the following environment, envFru:
import gym
import os, sys
import numpy as np
import pandas as pd
from gym import spaces
import random

class envFru(gym.Env):
    metadata = {'render.modes': ['human']}

    def __init__(self, df=None):
        self.df = df
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Box(low=np.array([0, 0, 0]), high=np.array([1, 1, 1]), dtype=np.float16)

    def reset(self):
        pass

    def step(self, action):
        pass

    def _next_observation(self):
        pass

    def _take_action(self, action):
        pass

    def render(self, mode='human', close=False):
        pass
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines.common.policies import MlpPolicy
from stable_baselines2.ppo.ppo import PPO

envF = DummyVecEnv([lambda: envFru()])
model = PPOPolicy(envF, MlpPolicy, learning_rate=0.001)
model.learn(total_timesteps=20000)

obs = envF.reset()
for i in range(MAX_EPISODES):
    action, _states = model.predict(obs)
    obs, reward, done, info = envF.step(action)
    # envF.render()
The traceback I'm getting is the following:
AttributeError Traceback (most recent call last)
<ipython-input-124-550b8c75c26b> in <module>
12 envF = DummyVecEnv([lambda : envFruit()])
13
---> 14 model = PPOPolicy(envF, MlpPolicy, learning_rate= 0.001)
15 model.learn(total_timesteps=20000)
16
~\Desktop\ImitationLearning\stable_baselines2\ppo\policies.py in __init__(self, observation_space, action_space, learning_rate, net_arch, activation_fn, adam_epsilon, ortho_init, log_std_init)
29 ortho_init=True, log_std_init=0.0):
30 super(PPOPolicy, self).__init__(observation_space, action_space)
---> 31 self.obs_dim = self.observation_space.shape[0]
32
33 # Default network architecture, from stable-baselines
AttributeError: 'DummyVecEnv' object has no attribute 'shape'
Are you sure this is your actual code? In the snippet above, the name PPOPolicy is not even defined; we would need to see the code of PPOPolicy. Evidently its constructor (its __init__ method) expects something with a shape attribute as its first argument, perhaps a pandas DataFrame or, more likely given the parameter name in the traceback, a gym space. Your envF does not have a shape attribute, and that leads to the error.
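The mechanism can be reproduced with plain-Python stand-ins. Note that Box, ToyVecEnv, and ToyPolicy below are hypothetical simplifications for illustration, not the real gym or stable-baselines classes:

```python
from collections import namedtuple

# Hypothetical stand-ins, not the real gym/stable-baselines classes.
Box = namedtuple("Box", ["shape"])        # a space: has a .shape attribute

class ToyVecEnv:                          # a wrapper like DummyVecEnv: has no .shape
    def __init__(self, observation_space):
        self.observation_space = observation_space

class ToyPolicy:                          # mimics the failing __init__
    def __init__(self, observation_space):
        self.obs_dim = observation_space.shape[0]

env = ToyVecEnv(Box(shape=(3,)))

try:
    ToyPolicy(env)                        # wrapper passed where a space is expected
except AttributeError as exc:
    print(exc)                            # 'ToyVecEnv' object has no attribute 'shape'

print(ToyPolicy(env.observation_space).obs_dim)  # 3: passing the space itself works
```

The same attribute lookup that fails on the wrapper succeeds on the space stored inside it, which is exactly what the traceback shows.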
Just judging from the names in your snippet, I guess the relevant line should read
model = PPOPolicy(
    envF.observation_space,
    envF.action_space,
    MlpPolicy,
    learning_rate=0.001
)
(Note that if you intended to use the released stable-baselines package rather than your local stable_baselines2 fork, its model classes instead take the policy first and the environment second.)
My assumption stems from the line in the traceback
super(PPOPolicy, self).__init__(observation_space, action_space)
which tells us that the constructor of PPOPolicy passes two variables named observation_space and action_space to its super() constructor. Since these names also appear on your environment, I guess this is the problem here. But as long as we don't see the correct and full code, this is just navigating through the fog.
It may also help you to learn how to read an error message; that should serve you well with future problems. I suggest you read something like https://www.tutorialsteacher.com/python/error-types-in-python