
What is the meaning of the values row in a POMDP file?


I am studying the POMDP file format, following this and many other links. I understand everything except what the values entry in the second row of the file stands for. It can be set to reward or cost, and I can't find an explanation anywhere. I'm confused, because it should be possible to have costs AND rewards within one document, no? Why do I have to specify one of them? Also, the value does not seem to be used anywhere in the rest of the file.


Solution

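  • In POMDPs you can use either rewards OR costs to define the learning goal. The only difference is the direction of optimization: with rewards you try to maximize the value function, whereas with costs you try to minimize it. A single file can still express both effects, because a negative reward acts as a cost and a negative cost acts as a reward.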

    In the POMDP file you declare which one you use by putting exactly one of the two keywords on the values line:

    values: [ reward, cost ]
    

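    The values line is not referenced again by name later in the file. Instead, when the solver reads the POMDP file, it uses this setting to interpret the numbers given in the R: entries as either rewards or costs.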
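
To make the effect of the values line concrete, here is a minimal Python sketch. It is not taken from any real solver: the function names parse_values_mode and as_reward, and the example file fragment, are made up for illustration. It assumes that comments start with # and that a solver which always maximizes could handle costs by negating them, which is one common way to do it but not necessarily what your solver does internally.

```python
import re

def parse_values_mode(pomdp_text):
    """Return 'reward' or 'cost' from the 'values:' line of a POMDP file.

    Hypothetical helper; assumes '#' starts a comment.
    """
    for line in pomdp_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        match = re.fullmatch(r"values\s*:\s*(reward|cost)", line)
        if match:
            return match.group(1)
    raise ValueError("no 'values:' line found")

def as_reward(raw_value, mode):
    """Map a number from an R: entry to a quantity to be maximized.

    Assumption: a maximizing solver treats a cost c as a reward of -c.
    """
    return raw_value if mode == "reward" else -raw_value

# Made-up preamble of a POMDP file, for illustration only.
EXAMPLE = """\
discount: 0.95
values: cost
states: 2
actions: 2
observations: 2
"""

mode = parse_values_mode(EXAMPLE)
print(mode, as_reward(5.0, mode))  # prints: cost -5.0
```

Under this reading, writing 5 in an R: entry with values: cost describes the same problem as writing -5 with values: reward, which is why only one of the two keywords has to be chosen per file.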