networkx graph-theory probability path-finding

shared probability of all paths between two nodes in a weighted limited graph

I have a weighted graph G=(V,E)

In this graph the weight of edge(v[i],v[j]) is the probability(p) of a direct transition between v[i] and v[j] (0<p<1).

For exemple: graph exemple

I am trying to calculate the probability of transition between 2 nodes v, u.

What I did so far:

I found all paths from v to u (using all_simple_paths )

[v -0.1-> a -0.2-> c -0.3-> u]

[v -0.1-> a -0.2-> c -0.2-> b -0.4-> d -0.5-> u]

[v -0.1-> b -0.2-> c -0.3-> u]

[v -0.1-> b -0.4-> d -0.5-> u]

I an calculate the probability of any individual path (by multply edges), but I don't sure how to union the probability of the different paths with shared nodes. I think I can do that by union and intersection between paths, but I'm searching after a simple programable algorithm.

Solution

To calculate the probability of all the events in a path occurring, multiply the probabilities of the individual events.

e.g. Two transitions of even odds will both happen 1 time in 4

To calculate the probability of one or more of several independent events occurring p(A or B) = p(A) + p(B) – p(A and B).

Your example:

p(D) = 0.5
p(C) = 0.3
p(B) = 0.2 + 0.06 - 0.012 = 0.248
p(A) = 0.3 * 0.2 = 0.06
p(V) = 0.0248 + 0.006 - 0.0001488 = 0.0306512

The psuedocode algorithm looks like this

- Mark all nodes 'uncalculated'
- LOOP over nodes
    - IF node has outlinks but no inlinks
        - LOOP over all paths from node to end node
            - LOOP PN over nodes in path
                 - LOOP I over inlinks to PN
                     - IF source node on I  'uncalculated'
                         - BREAK out of PN loop
                     - SET IP(I) = I probability * source node probability
                 - ENDLOOP I 
                 - IF 0 inlinks
                     - SET PN probability = 1
                 - IF 1 inlinks
                     - SET PN probability = IP[0]
                 - IF 2 inlinks
                     - SET PN probability = IP[0] + IP[1] - IP[0] * IP[1]
                 - IF inlinks > 2
                     - ask user to refactor input
                     - STOP
                 - IF PN = end node
                     - OUTPUT end node probability
                     - STOP

Here is the C++ code to do this ( it is far too complex to be coded in Python ):

void cPathFinder::probs()
        {
            // The probabilities are stored in the link and node attribute myCost
            // Mark all node probabilities as 'not yet calculated'
            for (auto &n : myG)
                n.second.myCost = -1;

            // loop over nodes
            for (auto &n : myG)
            {
                // check for possible starting node
                // i.e one with out edges and no in edges
                if (!(outDegree(n.second) && (!inDegree(n.second))))
                    continue;

                // iterate over all paths from starting node to target node
                visitAllPaths(
                    node(n.second),
                    myEnd,
                    [&, this](int length) -> int
                    {
                        // loop over nodes in path
                        for (int n : myPath)
                        {
                            if (n < 0)
                                continue;

                            // loop over inlinks
                            std::vector<double> vprob;
                            bool fOK = true;
                            for (const auto l : inlinks(n))
                            {
                                double prevNodeProb = node(l.first.first).myCost;
                                if (prevNodeProb == -1)
                                {
                                    // the previous node probability has not been calculated yet
                                    // no need to look at any more inlinks
                                    fOK = false;
                                    break;
                                }
                                // store the probability contribution from this inlink
                                // it is the product of the source node proabability and the link probability
                                vprob.push_back(
                                    prevNodeProb * l.second->myCost);
                            }
                            // check if there is enough information
                            // to calculate the probability for this node
                            if (!fOK)
                                break;

                            // all the previous nodes are calculated
                            // calculate this node's probability
                            switch (vprob.size())
                            {
                            case 0:
                                //starting node, assume probability of 100%
                                node(n).myCost = 1;
                                break;

                            case 1:
                                // one inlink, prob is previous node prob times link probability
                                node(n).myCost = vprob[0];
                                break;

                            case 2:
                                // two inlinks
                                node(n).myCost =
                                    vprob[0] + vprob[1] - vprob[0] * vprob[1];
                                break;
                                
                            default:
                                /*  More then two inlinks
                                The code does not handle this
                                but note that multiple inlinks can alwayd be reduced to a series of
                                nodes with 2 inlinks
                                */
                                throw std::runtime_error(
                                    userName(n) + " has more than 2 inlinks, please refactor input");
                            }
                        }

                        return 0;
                    });
            }

            std::stringstream ss;
            ss << "\nfinal node " << userName(myEnd) << " probability "
               << node(myEnd).myCost << "\n";
            myResults = ss.str();
        }

Output from sample run for your example

input
l u c 0.3
l u d 0.5
l c a 0.2
l c b 0.2
l d b 0.4
l a v 0.1
l b v 0.1
e v

path: u c a v 
path: u c b v
path: u d b v

node probs
c 0.3
u 1
d 0.5
a 0.06
b 0.248
v 0.0306

final node v probability 0.0306

More complete documentation at https://github.com/JamesBremner/PathFinder3/wiki/Probabilities

The complete code for this application is at https://github.com/JamesBremner/PathFinder3