Tags: python-3.x, matlab, matplotlib, matlab-figure, markov-decision-process

MDP Policy Plot for a Maze


I have a 5x5 maze specified as follows.

r = [1  0   1   1   1
     1  1   1   0   1
     0  1   0   0   1
     1  1   1   0   1
     1  0   1   0   1];

Here, 1s are paths and 0s are walls.

Assume I have a function foo(policy_vector, r) that maps the elements of the policy vector to the cells of r, using the encoding 1 = Up, 2 = Right, 3 = Down, 4 = Left. The MDP is set up so that the wall states are never reached, so the policy entries for those states are ignored in the plot.

policy_vector' = [3 2 2 2 3 2 2 1 2 3 1 1 1 2 3 2 1 4 2 3 1 1 1 2 2]
symbols' = [v > > > v > > ^ > v ^ ^ ^ > v > ^ < > v ^ ^ ^ > >]
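
To make the mapping concrete, here is a minimal Python sketch of what a function like foo might do. It assumes the policy vector lists the cells in row-major order (left to right, top to bottom), which the question does not state explicitly. The name policy_to_grid and the blank placeholder for wall cells are hypothetical choices made for illustration.

```python
import numpy as np

# Maze from the question: 1 = path, 0 = wall.
r = np.array([
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
])

# Policy from the question: 1 = up, 2 = right, 3 = down, 4 = left.
policy_vector = [3, 2, 2, 2, 3, 2, 2, 1, 2, 3, 1, 1, 1,
                 2, 3, 2, 1, 4, 2, 3, 1, 1, 1, 2, 2]

SYMBOLS = {1: "^", 2: ">", 3: "v", 4: "<"}


def policy_to_grid(policy, maze):
    """Return a grid of arrow symbols with the same shape as the maze.

    Assumption: `policy` is in row-major order. Wall cells (0 in the maze)
    are left as a single space because their policy entries are ignored.
    """
    maze = np.asarray(maze)
    actions = np.asarray(policy).reshape(maze.shape)
    grid = np.full(maze.shape, " ", dtype="<U1")
    for (i, j), action in np.ndenumerate(actions):
        if maze[i, j] != 0:
            grid[i, j] = SYMBOLS[int(action)]
    return grid


symbol_grid = policy_to_grid(policy_vector, r)
for row in symbol_grid:
    print(" ".join(row))
```

Keeping the symbols in a grid shaped like the maze means any plotting routine can index a cell's symbol by its row and column instead of tracking a separate counter.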

I am trying to display the policy for a Markov decision process in the context of solving a maze. How would I plot something that looks like this? MATLAB is preferable, but Python is fine.

[Image: example of the desired plot]

Even if somebody could just show me how to make a plot like this, I would be able to figure it out from there.

[Image: example plot]


Solution


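    function policy_plot(policy, r)
        % Plot an MDP policy on a maze, where r is 1 for paths and 0 for walls.
        % policy holds one action per cell in row-major order:
        % 1 = up, 2 = right, 3 = down, 4 = left.
        [row, col] = size(r);
        symbols = {'^', '>', 'v', '<'};
        policy_symbolic = get_policy_symbols(policy, symbols);
        figure()
        hold on
        axis([0, col, 0, row])   % x spans the columns, y spans the rows
        grid on
        % Black background, so any cell not painted below shows as a wall.
        fill([0, 0, col, col], [row, 0, 0, row], 'k')
        cnt = 1;                 % index into the policy
        for rr = row:-1:1        % rr is the y-coordinate of the cell's top edge
            for cc = 1:col
                is_goal = (rr == 1 && cc == col);   % bottom-right cell
                if r(row+1 - rr, cc) ~= 0 && ~is_goal
                    fill([cc-1, cc-1, cc, cc], [rr, rr-1, rr-1, rr], 'g')
                    text(cc - 0.55, rr - 0.5, policy_symbolic{cnt})
                end
                cnt = cnt + 1;
            end
        end
        % Draw the goal in the bottom-right cell using explicit coordinates
        % instead of relying on the loop variables left over after the loops.
        fill([col-1, col-1, col, col], [1, 0, 0, 1], 'b')
        text(col - 0.70, 0.5, 'Goal')
    end

    function policy_symbolic = get_policy_symbols(policy, symbols)
        % Map each numeric action in policy to its arrow symbol.
        policy_symbolic = cell(size(policy));
        for ii = 1:numel(policy)
            policy_symbolic{ii} = symbols{policy(ii)};
        end
    end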
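
Since the question also allows Python, here is a rough matplotlib sketch of the same idea. It is an illustrative port under stated assumptions, not the answer's original code: the policy is taken to be in row-major order, the goal defaults to the bottom-right cell as in the MATLAB version, and plot_policy along with its returned labels dictionary are hypothetical names added to make the result easy to inspect.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import numpy as np

SYMBOLS = {1: "^", 2: ">", 3: "v", 4: "<"}


def plot_policy(policy, maze, goal=None):
    """Draw walls in black, paths in green with an arrow, and the goal in blue.

    Assumptions: `policy` is in row-major order, and `goal` is a
    (row, col) index that defaults to the bottom-right cell.
    """
    maze = np.asarray(maze)
    n_rows, n_cols = maze.shape
    if goal is None:
        goal = (n_rows - 1, n_cols - 1)
    actions = np.asarray(policy).reshape(maze.shape)

    fig, ax = plt.subplots()
    labels = {}  # (row, col) -> text drawn in that cell
    for i in range(n_rows):
        for j in range(n_cols):
            y = n_rows - 1 - i  # flip so maze row 0 is drawn at the top
            if maze[i, j] == 0:
                color, label = "black", None
            elif (i, j) == goal:
                color, label = "tab:blue", "Goal"
            else:
                color, label = "tab:green", SYMBOLS[int(actions[i, j])]
            ax.add_patch(Rectangle((j, y), 1, 1,
                                   facecolor=color, edgecolor="gray"))
            if label is not None:
                ax.text(j + 0.5, y + 0.5, label, ha="center", va="center")
                labels[(i, j)] = label
    ax.set_xlim(0, n_cols)
    ax.set_ylim(0, n_rows)
    ax.set_aspect("equal")
    return fig, ax, labels


r = [[1, 0, 1, 1, 1],
     [1, 1, 1, 0, 1],
     [0, 1, 0, 0, 1],
     [1, 1, 1, 0, 1],
     [1, 0, 1, 0, 1]]
policy_vector = [3, 2, 2, 2, 3, 2, 2, 1, 2, 3, 1, 1, 1,
                 2, 3, 2, 1, 4, 2, 3, 1, 1, 1, 2, 2]
fig, ax, labels = plot_policy(policy_vector, r)
```

Calling fig.savefig with a file name writes the figure to disk. With an interactive backend, plt.show() displays it instead.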