Tags: artificial-intelligence, machine-learning, reinforcement-learning

Learning the Structure of a Hierarchical Reinforcement Task


I've been studying hierarchical reinforcement learning problems, and while a lot of papers propose interesting ways of learning a policy, they all seem to assume they know in advance a graph structure describing the actions in the domain. For example, The MAXQ Method for Hierarchical Reinforcement Learning by Dietterich describes a complex graph of actions and sub-tasks for a simple Taxi domain, but not how this graph was discovered. How would you learn the hierarchy of this graph, and not just the policy?


Solution

  • In Dietterich's MAXQ, the graph is constructed manually. It's considered a task for the system designer, in the same way that designing a representation space and a reward function is. (A sketch of what such a hand-built hierarchy looks like appears after this list.)

    Depending on what you're trying to achieve, you might want to automatically decompose the state space, learn relevant features, or transfer experience from simpler tasks to more complex ones; the second sketch below illustrates one simple take on automatic decomposition.

    I'd suggest you start reading papers that cite the MAXQ one you linked to. Without knowing exactly what you want to achieve, I can't be very prescriptive (and I'm not really on top of all the current RL research), but you might find relevant ideas in the work of Luo, Bell & McCollum or the papers by Madden & Howley.
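For concreteness, here's a minimal sketch of what a hand-specified hierarchy looks like for the Taxi domain. The subtask names follow Dietterich's paper (Root, Get, Put, Navigate, plus the primitive actions); representing the graph as a plain Python dict is just an illustrative choice, not anything from MAXQ itself:

```python
# A hand-built MAXQ-style task hierarchy for the Taxi domain.
# Each subtask maps to its list of child subtasks; primitives have no children.
TAXI_HIERARCHY = {
    "Root":     ["Get", "Put"],
    "Get":      ["Pickup", "Navigate"],
    "Put":      ["Putdown", "Navigate"],
    "Navigate": ["North", "South", "East", "West"],  # parameterised by a target in the paper
    # Primitive actions (leaves):
    "Pickup":   [], "Putdown": [],
    "North":    [], "South": [], "East": [], "West": [],
}

def primitive_actions(task, hierarchy=TAXI_HIERARCHY):
    """Recursively collect the primitive actions reachable from a subtask."""
    children = hierarchy[task]
    if not children:
        return {task}
    return set().union(*(primitive_actions(c, hierarchy) for c in children))
```

Calling primitive_actions("Root") returns all six primitive taxi actions, which is exactly the kind of structural knowledge the designer bakes in by hand.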
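As for learning the structure automatically: one simple family of ideas (not from the MAXQ paper) is subgoal discovery, e.g. McGovern & Barto's diverse-density approach, where states that occur unusually often on successful trajectories are treated as bottlenecks and promoted to subtask goals. A crude sketch of that intuition, assuming trajectories are lists of hashable states:

```python
from collections import Counter

def discover_bottlenecks(successful_trajectories, failed_trajectories, top_k=3):
    """Score states by how much more often they appear on successful
    trajectories than on failed ones; return the top-scoring candidates.
    A toy stand-in for real subgoal-discovery methods."""
    success_counts = Counter(s for traj in successful_trajectories for s in set(traj))
    failure_counts = Counter(s for traj in failed_trajectories for s in set(traj))
    n_s = max(len(successful_trajectories), 1)
    n_f = max(len(failed_trajectories), 1)
    scores = {s: success_counts[s] / n_s - failure_counts[s] / n_f
              for s in success_counts}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Each discovered bottleneck can then seed a subtask ("reach this state"), giving you a candidate hierarchy to hand to a MAXQ-style learner instead of specifying the graph entirely by hand.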