I am writing code for autonomous driving using RL. I am using Stable-Baselines3 and an OpenAI Gym environment. When I run the following code in a Jupyter notebook, it raises an error:
# Testing our model
episodes = 5  # test the environment 5 times
for episode in range(1, episodes + 1):  # loop over the episodes
    obs = env.reset()  # initial observation
    # The observation is passed through our model, which tells us
    # which action is best
    done = False
    score = 0
    while not done:
        env.render()
        # model.predict returns the model's action and the next state
        action, _ = model.predict(obs)
        # Take that action to get the best reward.
        # The observation space is a Box.
        # Rather than taking a random action, we call model.predict(obs) on the
        # current observation to generate the action with the best possible reward.
        obs, reward, done, info = env.step(action)  # gives state and reward (value 1)
        # reward is 1 for every step, including the termination step
        score += reward
    print('Episode:{},Score:{}'.format(episode, score))
env.close()
My full code is available here: https://drive.google.com/file/d/1JBVmPLn-N1GCl_Rgb6-qGMpJyWvBaR1N/view?usp=sharing
I am using Python 3.8.13 in an Anaconda environment, with the CPU version of PyTorch, on Windows 10. Please help me solve this problem.
Using .copy() for numpy arrays should help (because PyTorch tensors can't handle negative strides):
    action, _ = model.predict(obs.copy())
I couldn't get your notebook running quickly because of dependency problems, but I had the same error with the AI2THOR simulator, and adding .copy() fixed it.
Maybe someone with more technical knowledge of numpy, torch, or AI2THOR can explain in more detail why the error occurs.
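In the meantime, here is a minimal sketch of how a negative stride can arise and why .copy() removes it. It assumes the observation is a flipped view of an image array, which I have not confirmed for your environment; the array named frame is a made-up stand-in, not your actual observation. The torch part is skipped if PyTorch is not installed.

```python
import numpy as np

# Hypothetical stand-in for an image observation (height x width x channels).
frame = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)

# Reversing an axis with a slice returns a view, not a new array.
# The view walks the same memory backwards, so its first stride is negative.
flipped = frame[::-1]
print("strides of flipped view:", flipped.strides)

# .copy() allocates fresh C-contiguous memory, so all strides are positive again.
fixed = flipped.copy()
print("strides after .copy():  ", fixed.strides)

try:
    import torch
except ImportError:
    torch = None  # PyTorch is optional for this sketch

if torch is not None:
    try:
        torch.from_numpy(flipped)  # negative stride: PyTorch refuses the view
    except ValueError as err:
        print("view rejected:", err)
    tensor = torch.from_numpy(fixed)  # the copy converts without complaint
    print("tensor shape:", tuple(tensor.shape))
```

The copy holds the same values as the view, only laid out in memory in forward order, so the model sees an identical observation.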