machine-learning pacman supervised-learning q-learning

Training a pacman agent using any supervised learning algorithm

I created a simple game of Pacman (no power pills) and trained an agent for it using the Q-learning algorithm. Now I am thinking about training it with a supervised learning algorithm instead. I could build a dataset by recording state information together with the action a human player took in that state, and then train a classifier on it. My question is: am I going in the right direction, and is this the right approach to get Pacman to move through the maze perfectly, given that it doesn't have any reward system?

Solution

What would you use as state? Supervised learning is all about generalization. You define some parametrized model (e.g. a neural network) and then learn/estimate the parameters (e.g. the weights) from your data. Then you can use this model to predict something.

If all you have is a finite list of states (as you probably had with Q-learning) and there is only a single "right" choice for each state (whatever the human teacher says), then there is nothing to predict. There is no "axis along which you can generalize". You only need a simple look-up table and a very patient human to fill it in.
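To make the look-up table point concrete, here is a minimal sketch. The class name `LookupPolicy`, the state encoding (Pacman position plus one ghost position), and the action labels are all assumptions made up for illustration; they are not taken from your game.

```python
from typing import Dict, Hashable, Optional


class LookupPolicy:
    """Hypothetical policy: replay the human's action for states seen before."""

    def __init__(self) -> None:
        self.table: Dict[Hashable, str] = {}

    def record(self, state: Hashable, action: str) -> None:
        # Store (or overwrite) the human's choice for this exact state.
        self.table[state] = action

    def act(self, state: Hashable) -> Optional[str]:
        # Returns None for any state the human never demonstrated.
        return self.table.get(state)


policy = LookupPolicy()
# Assumed state encoding: (pacman_position, ghost_position)
policy.record(((1, 1), (5, 5)), "right")
policy.record(((2, 1), (5, 5)), "up")

seen = policy.act(((1, 1), (5, 5)))
unseen = policy.act(((3, 1), (5, 5)))
```

The table has no answer for a state that was never demonstrated, which is exactly the missing generalization described above.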

If you want to apply supervised learning, you need to put in some prior knowledge. You need to have some kind of similarity measure (e.g. real-valued inputs/outputs, which have an inherent similarity for near-identical values) or create multiple instances of something.

For example, you could use a 3x3 grid around the player as input and predict the probability that a human player would move up/down/left/right in this situation. You could then try to mimic the human by sampling moves according to the predicted probabilities. Obviously, this approach will not move Pacman perfectly unless you use a very large grid (e.g. 20x20), at which point you are practically back to filling ones and zeroes into a simple look-up table.
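A rough sketch of the 3x3 idea, under some assumptions: the maze is a list of strings with `#` for walls, and the model is a plain smoothed frequency count per window rather than a neural network. The names `local_window` and `MoveModel` are hypothetical. The generalization here comes only from the fact that many full-board states share the same 3x3 window.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, Iterable, Sequence, Tuple

ACTIONS = ("up", "down", "left", "right")
Window = Tuple[str, ...]


def local_window(maze: Sequence[str], row: int, col: int) -> Window:
    """3x3 neighbourhood around (row, col); off-grid cells count as walls."""
    cells = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            inside = 0 <= r < len(maze) and 0 <= c < len(maze[r])
            cells.append(maze[r][c] if inside else "#")
    return tuple(cells)


class MoveModel:
    """Hypothetical model: smoothed frequency of human moves per 3x3 window."""

    def __init__(self, smoothing: float = 1.0) -> None:
        self.counts: Dict[Window, Counter] = defaultdict(Counter)
        self.smoothing = smoothing  # additive (Laplace) smoothing, an assumption

    def fit(self, samples: Iterable[Tuple[Window, str]]) -> None:
        for window, action in samples:
            self.counts[window][action] += 1

    def predict_proba(self, window: Window) -> Dict[str, float]:
        seen = self.counts.get(window, Counter())
        total = sum(seen.values()) + self.smoothing * len(ACTIONS)
        return {a: (seen[a] + self.smoothing) / total for a in ACTIONS}

    def sample_move(self, window: Window, rng: random.Random) -> str:
        # Mimic the human by sampling from the predicted distribution.
        probs = self.predict_proba(window)
        return rng.choices(ACTIONS, weights=[probs[a] for a in ACTIONS], k=1)[0]


maze = ["#####",
        "#.P.#",
        "#.#.#",
        "#####"]
window = local_window(maze, 1, 2)

model = MoveModel()
# Made-up demonstrations: the human went left twice and right once here.
model.fit([(window, "left"), (window, "left"), (window, "right")])
probs = model.predict_proba(window)
move = model.sample_move(window, random.Random(0))
```

A window the human never visited falls back to a uniform distribution, so the agent still moves, just not intelligently. Swapping the count table for a classifier over the nine cells would be the next step if you want similarity between different windows to matter.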