python-3.x matlab matplotlib matlab-figure markov-decision-process

MDP Policy Plot for a Maze

I have a 5x5 maze specified as follows.

r = [1  0   1   1   1
     1  1   1   0   1
     0  1   0   0   1
     1  1   1   0   1
     1  0   1   0   1];

Here, 1s are paths and 0s are walls.

Assume I have a function foo(policy_vector, r) that maps each element of the policy vector to the corresponding cell of r. The actions are encoded as 1 = Up, 2 = Right, 3 = Down, 4 = Left. The MDP is set up so that the wall states are never reached, so the policy entries for those states should be ignored in the plot.

policy_vector' = [3 2 2 2 3 2 2 1 2 3 1 1 1 2 3 2 1 4 2 3 1 1 1 2 2]
symbols' = [v > > > v > > ^ > v ^ ^ ^ > v > ^ < > v ^ ^ ^ > >]
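
To make the mapping concrete, here is a minimal Python sketch that turns the policy vector into symbols and masks out the walls. The function name policy_grid and the '#' wall marker are made up for illustration, and it assumes the policy vector lists the cells in row-major order (top row first), which the question does not state explicitly.

```python
# Maze from the question: 1 = path, 0 = wall.
r = [
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
]
policy_vector = [3, 2, 2, 2, 3, 2, 2, 1, 2, 3, 1, 1, 1, 2, 3,
                 2, 1, 4, 2, 3, 1, 1, 1, 2, 2]

# Action encoding from the question: 1 = Up, 2 = Right, 3 = Down, 4 = Left.
SYMBOLS = {1: "^", 2: ">", 3: "v", 4: "<"}


def policy_grid(policy, maze, wall="#"):
    """Return one list of symbols per maze row.

    Hypothetical helper: assumes `policy` is in row-major order and
    replaces the entries for wall cells with `wall`.
    """
    n_cols = len(maze[0])
    grid = []
    for i, maze_row in enumerate(maze):
        row_syms = []
        for j, cell in enumerate(maze_row):
            action = policy[i * n_cols + j]
            row_syms.append(SYMBOLS[action] if cell else wall)
        grid.append(row_syms)
    return grid


for line in policy_grid(policy_vector, r):
    print(" ".join(line))
# v # > > v
# > > ^ # v
# # ^ # # v
# > ^ < # v
# ^ # ^ # >
```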

I am trying to display my policy decision for a Markov Decision Process in the context of solving a maze. How would I plot something that looks like this? Matlab is preferable but Python is fine.

Even if somebody could just show me how to make a plot like this, I would be able to figure it out from there.

Solution

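function policy_plot(policy, r)
    % Plot the policy for maze r (1 = path, 0 = wall).
    % policy is read in row-major order; the goal is the bottom-right cell.
    [row, col] = size(r);
    symbols = {'^', '>', 'v', '<'};    % 1 = Up, 2 = Right, 3 = Down, 4 = Left
    policy_symbolic = get_policy_symbols(policy, symbols);
    figure()
    hold on
    axis([0, col, 0, row])             % x spans the columns, y spans the rows
    grid on
    cnt = 1;
    fill([0,0,col,col],[row,0,0,row],'k')    % black background marks the walls
    for rr = row:-1:1                  % rr is the y coordinate of the cell's top edge
        for cc = 1:col
            % Draw every path cell except the goal in the bottom-right corner
            if r(row+1 - rr,cc) ~= 0 && ~(row == row+1 - rr && col == cc)
                fill([cc-1,cc-1,cc,cc],[rr,rr-1,rr-1,rr],'g')
                text(cc - 0.55,rr - 0.5,policy_symbolic{cnt})
            end
            cnt = cnt + 1;
        end
    end
    % After the loops rr == 1 and cc == col, which is the goal cell
    fill([cc-1,cc-1,cc,cc],[rr,rr-1,rr-1,rr],'b')
    text(cc - 0.70,rr - 0.5,'Goal')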

function [policy_symbolic] = get_policy_symbols(policy, symbols)
    policy_symbolic = cell(size(policy));
    for ii = 1:length(policy)
        policy_symbolic{ii} = symbols{policy(ii)};
    end
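
Since the question says Python is fine too, here is a rough matplotlib sketch of the same idea. It is not a line-by-line port: it assumes, like the MATLAB function above, that the policy vector is in row-major order and that the goal is the bottom-right cell. The colors and the ARROWS dictionary are illustrative choices.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Action encoding from the question: 1 = Up, 2 = Right, 3 = Down, 4 = Left.
ARROWS = {1: "^", 2: ">", 3: "v", 4: "<"}


def policy_plot(policy, maze):
    """Draw the maze with one symbol per path cell and return (fig, ax).

    Assumes `policy` is row-major and the goal is the bottom-right cell.
    """
    n_rows, n_cols = len(maze), len(maze[0])
    fig, ax = plt.subplots()
    ax.set_xlim(0, n_cols)
    ax.set_ylim(0, n_rows)
    ax.set_aspect("equal")
    ax.set_facecolor("black")  # anything not drawn over stays black (walls)
    for i in range(n_rows):
        y = n_rows - 1 - i  # matrix row 0 is drawn at the top of the plot
        for j in range(n_cols):
            if not maze[i][j]:
                continue  # wall: policy entry is ignored
            is_goal = i == n_rows - 1 and j == n_cols - 1
            ax.add_patch(Rectangle((j, y), 1, 1,
                                   facecolor="blue" if is_goal else "green",
                                   edgecolor="black"))
            label = "Goal" if is_goal else ARROWS[policy[i * n_cols + j]]
            ax.text(j + 0.5, y + 0.5, label, ha="center", va="center")
    return fig, ax
```

With r given as a list of rows and policy_vector as a flat list of the 25 actions, calling `fig, ax = policy_plot(policy_vector, r)` followed by `plt.show()` should display the grid.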