artificial-intelligence machine-learning probability planning

Aggregating Probabilistic Plans


I'm trying to create a simple STRIPS-based planner. I've completed the basic functionality to compute separate probabilistic plans that will reach a goal, but now I'm trying to determine how to aggregate these plans based on their initial action, to determine what the "overall" best action is at time t0.

Consider the following example. Utility, bounded between 0 and 1, represents how well the plan accomplishes the goal. CF, also bounded between 0 and 1, represents the certainty-factor, or the probability that performing the plan will result in the given utility.

Plan1: CF=0.01, Utility=0.7
Plan2: CF=0.002, Utility=0.9
Plan3: CF=0.03, Utility=0.03

If all three plans, which are mutually exclusive, start with the action A1, how should I aggregate them to determine the overall "fitness" of using action A1? My first thought is to sum the certainty-factors and multiply that by the average of the utilities. Does that seem correct?

So my current result would look like:

fitness(A1) = (0.01 + 0.002 + 0.03) * (0.7 + 0.9 + 0.03)/3. = 0.02282

Or should I calculate the individual likely utilities, and average those?

fitness(A1) = (0.01*0.7 + 0.002*0.9 + 0.03*0.03)/3. = 0.00323

Is there a more theoretically sound way?


Solution

  • If you take action A1, you then have to decide which of the 3 mutually exclusive plans to follow. At that point, the expected utility of plan 1 is

    E[plan1] = Prob[plan1 succeeds] * utility-of-success
               + Prob[plan1 fails] * utility-of-failure
             = .01*.7 + .99*0   // I assume a failed plan has utility 0
             = .007
    

    Similarly for the other 2 plans: E[plan2] = .002*.9 = .0018 and E[plan3] = .03*.03 = .0009. But since you can only follow one plan, the real expected utility of taking action A1 (which I think is what you mean by "fitness") is that of the best plan:

    fitness(A1) = max(E[plan1], E[plan2], E[plan3]) = .007
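
A minimal Python sketch of the calculation above might look like the following. The names `Plan`, `expected_utility`, and `action_fitness` are made up for illustration, and the `failure_utility=0.0` default carries over the same assumption as the answer: a plan that fails is worth nothing.

```python
from typing import NamedTuple, Sequence


class Plan(NamedTuple):
    """Hypothetical container for one plan that starts with a given action."""
    name: str
    cf: float       # certainty factor: probability the plan yields `utility`
    utility: float  # utility obtained if the plan succeeds


def expected_utility(plan: Plan, failure_utility: float = 0.0) -> float:
    # E[plan] = P(success) * utility + P(failure) * failure_utility
    # failure_utility = 0.0 is an assumption, not something the question states.
    return plan.cf * plan.utility + (1.0 - plan.cf) * failure_utility


def action_fitness(plans: Sequence[Plan], failure_utility: float = 0.0) -> float:
    # Only one of the mutually exclusive plans can be followed,
    # so the action is worth as much as its best plan.
    return max(expected_utility(p, failure_utility) for p in plans)


plans_a1 = [
    Plan("Plan1", cf=0.01, utility=0.7),
    Plan("Plan2", cf=0.002, utility=0.9),
    Plan("Plan3", cf=0.03, utility=0.03),
]

best = max(plans_a1, key=expected_utility)
print(best.name, action_fitness(plans_a1))
```

With the numbers from the question, Plan1 comes out as the best plan, with a fitness of roughly 0.007 up to floating-point rounding. To pick the best initial action overall, the same function can be applied to each action's group of plans, choosing the action with the largest result.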