Tags: python, machine-learning, reinforcement-learning, openai-gym

OpenAI gym and Python threading


I am working on a variation of A3C/ACER and I have several workers, each running on its own thread. I am using OpenAI gym environments.

Python threading works, but it cannot fully utilize all cores. Since there is no blocking I/O, the threads rarely yield the GIL, so they effectively run one at a time.

I would like the workers to somehow release the GIL while executing actions in their respective environments.

I would appreciate your feedback: does this make sense, and is it possible?
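To illustrate the problem being described: a gym environment's `step()` is typically pure Python, so a thread holds the GIL for the whole call and other worker threads cannot run in parallel. The sketch below uses a hypothetical `DummyEnv` (not a real gym environment) whose `step()` is CPU-bound; the threads all complete, but on CPython only one of them executes Python bytecode at any instant:

```python
import threading

class DummyEnv:
    """Hypothetical stand-in for a gym environment whose step()
    is pure Python and therefore holds the GIL while it runs."""
    def __init__(self):
        self.t = 0

    def step(self, action):
        # CPU-bound pure-Python work: only one thread can execute this
        # at a time, regardless of how many cores the machine has.
        s = 0
        for i in range(50_000):
            s += (i * action) % 7
        self.t += 1
        return s, 0.0, False, {}

def worker(env, steps, results, idx):
    total = 0
    for _ in range(steps):
        obs, reward, done, info = env.step(1)
        total += obs
    results[idx] = total

results = [None] * 4
threads = [threading.Thread(target=worker, args=(DummyEnv(), 10, results, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(r is not None for r in results))  # → True: all workers finished
```

If `step()` spent its time in C code that releases the GIL (as NumPy does for large array operations), the threads would overlap; for pure-Python environment logic they will not.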


Solution

  • Answering my own question: I found that quite an efficient approach is demonstrated in OpenAI's universe-starter-agent: https://github.com/openai/universe-starter-agent.

    The implementation uses TensorFlow and runs the workers as independent processes, alongside a parameter server, so each worker has its own interpreter and its own GIL.

    I think this can be useful as a reference to other people too.
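    The same idea can be sketched with only the standard library: run each worker as a separate OS process so there is no GIL contention between them, and exchange data through queues. This is a minimal illustration, not the universe-starter-agent implementation (which uses TensorFlow's distributed runtime); the names `worker`, `param_queue`, and the "gradient step" arithmetic are all placeholders:

    ```python
    import multiprocessing as mp

    def worker(worker_id, param_queue, result_queue):
        # Each worker is a separate process with its own interpreter,
        # so workers do not contend for a shared GIL.
        params = param_queue.get()               # fetch params from the "server"
        local_update = sum(params) + worker_id   # placeholder for a gradient step
        result_queue.put((worker_id, local_update))

    if __name__ == "__main__":
        param_queue = mp.Queue()
        result_queue = mp.Queue()
        n_workers = 4
        # The "parameter server" broadcasts one copy of the parameters per worker.
        for _ in range(n_workers):
            param_queue.put([1.0, 2.0, 3.0])
        procs = [mp.Process(target=worker, args=(i, param_queue, result_queue))
                 for i in range(n_workers)]
        for p in procs:
            p.start()
        updates = dict(result_queue.get() for _ in range(n_workers))
        for p in procs:
            p.join()
        print(sorted(updates.values()))  # → [6.0, 7.0, 8.0, 9.0]
    ```

    In a real A3C setup each worker would also create its own gym environment inside the child process, since environment objects generally should not be shared across process boundaries.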