I have a bunch of Java code that constitutes an environment and an agent. I want to use one of the Python reinforcement learning libraries (stable-baselines, tf-agents, rllib, etc.) to train a policy for the Java agent/environment. And then deploy the policy on the Java side for production. Is there standard practice for incorporating other languages into Python RL libraries? I was thinking of one of the following solutions:
Which one would be better? Are there any other ways?
Edit: I ended up going the former - deploying a web server that encapsulates the environments. works quite well for me. Leaving the question open in case there is a better practice to handle this kind of situations!
The first approach is fine. RLLib implemented it the same way for the PolicyServerInput. Which is used for external Envs. https://github.com/ray-project/ray/blob/82465f9342cf05d86880e7542ffa37676c2b7c4f/rllib/env/policy_server_input.py
So take a look into their implementation. It uses Python data serialization, so I guess an own impl would be best to connect to Java.