I have a directed weighted graph G=(V,E).
In this graph the weight of edge(v[i],v[j]) is the count of transition between v[i] and v[j].
I am trying to determine the best way to accomplish task: how to find the probability P of path from node A to node B
All weights are positive integers.
For example,
weight(A,B)=count of transition from A to B
weight(B,C)=count of transition from B to C
weight(B,D)=count of transition from B to D
weight(D,C)=count of transition from D to C
And we have several paths:
A->B->C
A->B->D->C
So, how to calculate probability P of these paths and choose the best one?
It can be solved by reducing the problem to shortest path problem, assuming we are indeed discussing probabilities (that means, each weight is in range [0,1]
).
Let the graph be G=(V,E)
, and the probability between two vertices denoted as w(u,v)
.
Define: w'(u,v) = -log(w(u,v))
The shortest path from some node s
to some node t
is then the shortest path in the graph when using w'
as weight function. You can find the shortest path using Dijkstra's algorithm or Bellman-Ford algorithm
Proof:
For a path v1->v2->...->vn
, its probability is w(v1,v2) * w(v2,v3) * ... * w(vn-1,vn)
.
When using w'
as the weight, the cost of this path when using any shortest path algorithm is:
d(v1,vn) = w'(v1,v2) + w'(v2,v3) + ... + w'(vn-1,vn) =
d(v1,vn) = -log(w(v1,v2)) + -log(w(v2,v3) + ... + -log(w(vn-1,vn)) =
d(v1,vn) = -1* [ log(w(v1,v2)) + log(w(v2,v3)) + ... + log(w(vn-1,vn)) =
d(v1,vn) = -1* log(w(v1,v2) * w(v2,v3) * ... * w(vn-1,vn))
This obviously applies also for the minimal path found from s
to t
.
This means that this path have minimal value of:
s(s,t) = -1* log(w(s,v2) * w(v2,v3) * ... * w(vn-1,t))
And since log is monotonic function, it is also minimal value of -1 * w(s,v2) * w(v2,v3) * ... * w(vn-1,t)
, which is the maximal value of w(s,v2) * w(v2,v3) * ... * w(vn-1,t)
, and that's exactly the probability.