I'm trying to get some insight into reinforcement learning, using OpenAI Gym as a learning environment. I'm doing this by reading the book *Hands-On Reinforcement Learning with Python*. The book provides some code, but often the code doesn't work until I unwrap the environment first, as shown in: openai gym env.P, AttributeError 'TimeLimit' object has no attribute 'P'
However, I am still interested in the WHY of this unwrapping. Why do you need to unwrap? What does it do exactly? And why isn't the code written like that in the book? Is the software outdated, as Giuliov assumed?
Thanks in advance.
OpenAI Gym offers many different environments, each with its own set of parameters and methods. Nevertheless, they are generally wrapped by a single class (similar to an interface in other OOP languages) called `Env`. This class exposes the most essential methods common to any environment, like `step`, `reset` and `seed`. Having this "interface" class is great, because it allows your code to be environment-agnostic. It also makes things easier if you want to test a single agent on different environments.
However, if you want to access the behind-the-scenes dynamics of a specific environment, then you use the `unwrapped` property.
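This is also why you got the `AttributeError`: `gym.make` does not hand you the raw environment but wraps it in wrapper classes such as `TimeLimit` (which ends episodes after a fixed number of steps). The wrappers forward the common `Env` methods like `step` and `reset`, but they do not expose environment-specific attributes such as FrozenLake's transition table `P`; `unwrapped` returns the innermost, raw environment where those attributes live. A minimal sketch, again assuming the classic Gym API and the `FrozenLake-v0` environment:

```python
import gym

env = gym.make("FrozenLake-v0")

# env is a TimeLimit wrapper around the raw FrozenLakeEnv, so the
# environment-specific attribute P is not visible on the wrapper:
# env.P  # AttributeError: 'TimeLimit' object has no attribute 'P'

# unwrapped returns the innermost, raw environment, where P lives.
raw_env = env.unwrapped

# P[state][action] is a list of (probability, next_state, reward, done) tuples.
print(raw_env.P[0][0])
```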