Search code examples
algorithmgraphdynamic-programmingdepth-first-searchshortest-path

Single Source - Single Destination Shortest Path (DFS + DP) - Time complexity?


Context:

I want to write an algorithm for finding the shortest(least cost) path from a fixed source to a fixed destination node in a weighted directed graph with non negative weights (can have cycles).

The following DFS + DP algorithm does the above.

  1. DO DFS - at each node ask what is the minimum cost from current node to the DST node.
  2. This is done by asking - out of each neighbour which one will give the minimum (cost_to_dst + edgecost_to_neighbour)
  3. After we find this minimum cost to DST we store it in a dp array, so that we don't have to compute it again for that node.
  4. To avoid going in infinite loop of recursive dfs calls, we limit the depth of calls to V(no. of total nodes) as the max length of the shortest path from SRC to DST can be V.
  5. So the base conditions for dfs/recursive calls are : i) if you reach DST, return 0. ii) if you exhaust V dfs calls, return INT_MAX. iii) if node has already been processed return dp[node]

Pseudocode:

int dfs(int u, int stops = V + 1, int &dst, vector<vector<pair<int, int>>> &adj, vector<int> &dp)
    {
        if(u == dst) // reached dst
            return 0; 
        if(stops == 0) // can't go on anymore
            return INT_MAX; 
        if(dp[u]!= -1)
            return dp[u];

        int res = INT_MAX;
        for(auto &[v, edgecost]: adj[u])
        {
            int vdstcost = dfs(v, stops - 1, dst, adj, dp); // min cost from v to dst
            if(vdstcost != INT_MAX)
                res = min(res, vdstcost + edgecost);
        }
        return dp[u] = res;
    }

Queries:

  1. What will be the Time Complexity of this code in terms of E (no. of edges) and V (no. of nodes)?

  2. Is this logically correct? i.e Will this algorithm give me the correct answer for every scenario within constraints?

The 4th point of introducing limit of depth V to recursive calls is making it difficult for me to analyse the time complexity. But I'm confident it will be somewhere between O(V+E) to O(VE).

Imagine a path --> A --->...----> B ---->...----> A. In such a case, the recursive calls will go into loops until (V) depth is exhausted. So the TC can't simply be O(V+E)

Edit - fixed pseudocode

The DP also needs to consider no. of stops left , as pointed in the below answer. below code is updated to use 2D DP


int dfs(int u, int stops, int dst, vector<vector<pair<int, int>>> &adj, vector<vector<int>> &dp)
{
    if(u == dst) // reached dst
        return 0; 
    if(stops == 0) // can't go on anymore
        return INT_MAX; 
    if(dp[u][stops]!= -1)
        return dp[u][stops];

    int res = INT_MAX;
    for(auto &[v, edgecost]: adj[u])
    {
        int vdstcost = dfs(v, stops - 1, dst, adj, dp); // min cost from v to dst
        if(vdstcost != INT_MAX)
            res = min(res, vdstcost + edgecost);
    }
    return dp[u][stops] = res;
}

Solution

  • If I understand the question correctly, you intend to have a working algorithm that only relies on memoization and depth-first traversal, without the notion of visited/not-visited, nor priority queues. But it doesn't always work:

    Issues

    The algorithm you've specified will not always find the path because it may store INT_MAX distances in dp[u] when actually there is a path from u to the target:

    • The memoization should not be keyed by u alone. The result depends on the number of stops that are still available. It might be that with just 2 stops available, there is no way to reach the target, but with 5 there is a way. So you need to memoize the results per node and per stops.

    Not a breaking issue, but:

    • in your pseudo code the default (initial) value for stops could be π‘‰βˆ’1, as you already give it a node as argument, which counts as one, leaving you only π‘‰βˆ’1 for further expansions (i.e. number edges to follow).

    Complexity analysis

    Once the above is fixed, we can consider this "bad" case:

    enter image description here

    We have 𝑛 nodes, where each has two outgoing edges, one to itself, and one to the "next" node, and where all weights are 1. Let 𝑣1 be the source and 𝑣𝑛 the target. Assume that the loop-edges are always visited first in the worst case. Then if we ignore memoization for a moment, a DFS starting at 𝑣1 will try these paths, all of length π‘‰βˆ’1 (i.e. with 𝑉 nodes):

    • 𝑣1 β†’ ... β†’ 𝑣1 β†’ 𝑣1 β†’ 𝑣1
    • 𝑣1 β†’ ... β†’ 𝑣1 β†’ 𝑣1 β†’ 𝑣2
    • 𝑣1 β†’ ... β†’ 𝑣1 β†’ 𝑣2 β†’ 𝑣2
    • 𝑣1 β†’ ... β†’ 𝑣1 β†’ 𝑣2 β†’ 𝑣3
    • 𝑣1 β†’ ... β†’ 𝑣2 β†’ 𝑣2 β†’ 𝑣2
    • 𝑣1 β†’ ... β†’ 𝑣2 β†’ 𝑣2 β†’ 𝑣3
    • ...
    • 𝑣1 β†’ 𝑣2 β†’ ... β†’ π‘£π‘›βˆ’1 β†’ 𝑣𝑛

    Only at the last attempt will the target be found and will all nodes receive a non-infinite assignment in the dp array.

    If we consider memoization to prune the tree, then at node 𝑣1 we will have 𝑛 different values of stops that can occur at that node. For 𝑣2 we will have one less possibility for stops, ...etc, up to For 𝑣𝑛 where stops can be nothing else than 0. So to to write to all these dp entries (i.e. the worst case), we will need 𝑛 + π‘›βˆ’1 + π‘›βˆ’2 + ... + 1 visits. All other node visits, coming via the alternative edge to the same node will benefit from memoization. So those count for the same number of operations.

    So we get a complexity for this scenario that is O(𝐸²).

    If you would limit to graphs without self-loops, then you can set up a similar reasoning with a graph that has an even number of nodes 𝑛 and where the nodes in an odd-even pair reference each other, like this:

    enter image description here

    Efficient algorithms

    Wikipedia lists some of the popular algorithms to find a shortest path in a graph. Even though Dijkstra's algorithm can solve the problem of finding shortest paths from a given vertex to all vertices in the graph, it can be used for a single destination as well, where it simply returns when that vertex has been found. It has a worst case time complexity of O(𝐸 + 𝑉log𝑉).