Search code examples
algorithmgraphdepth-first-searchtopological-sort

Topological sort to find the number of paths to t


I have to develop an O(|V|+|E|) algorithm related to topological sort which, in a directed acyclic graph (DAG), determines the number of paths from each vertex of the graph to t (t is a node with out-degree 0). I have developed a modification of DFS as follow:

DFS(G,t):
    for each vertex u ∈ V do
        color(u) = WHITE
        paths_to_t(u) = 0
    for each vertex u ∈ V do
        if color(u) == WHITE then
            DFS-Visit(u,t)

DFS-Visit(u,t):
    color(u) = GREY
    for each v ∈ neighbors(u) do
        if v == t then
            paths_to_t(u) = paths_to_t(u) + 1
        else then
            if color(v) == WHITE then
                DFS-Visit(v)
            paths_to_t(u) = paths_to_t(u) + paths_to_t(v)
    color(u) = BLACK

But I am not sure if this algorithm is related to topological sort or if should I restructure my work with another point of view.


Solution

  • It can be done using Dynamic Programming and topological sort as follows:

    Topological sort the vertices, let the ordered vertices be v1,v2,...,vn
    create new array of size t, let it be arr
    init: arr[t] = 1
    for i from t-1 to 1 (descending, inclusive):
        arr[i] = 0  
        for each edge (v_i,v_j) such that i < j <= t:
             arr[i] += arr[j]
    

    When you are done, for each i in [1,t], arr[i] indicates the number of paths from vi to vt

    Now, proving the above claim is easy (comparing to your algorithm, which I have no idea if its correct and how to prove it), it is done by induction:

    Base: arr[t] == 1, and indeed there is a single path from t to t, the empty one.
    Hypothesis: The claim is true for each k in range m < k <= t
    Proof: We need to show the claim is correct for m.
    Let's look at each out edge from vm: (v_m,v_i).
    Thus, the number of paths to vt starting from v_m that use this edge (v_m,v_i). is exactly arr[i] (induction hypothesis). Summing all possibilities of out edges from v_m, gives us the total number of paths from v_m to v_t - and this is exactly what the algorithm do.
    Thus, arr[m] = #paths from v_m to v_t

    QED

    Time complexity:
    The first step (topological sort) takes O(V+E).
    The loop iterate all edges once, and all vertices once, so it is O(V+E) as well.
    This gives us total complexity of O(V+E)