Search code examples
python-3.xalgorithmtreegraph-theoryn-ary-tree

Number of distinct paths in tree, that have value of nodes in that path greater than or equal K


Problem Statement:
You are given an integer N denoting number of nodes in that tree. Now, you need to count, how many distinct paths are there in the tree, such that, minimum node value in that path is greater than or equal to k.

Input format:

  1. First line contains total number of nodes N in that tree and a positive integer value K.
  2. Next N-1 lines contain a pair of integers u, v (values are not comma seperated), which denote that there is an edge between node u and v, in the tree.

Example:

Input:

4 2  
1 2  
2 3  
3 4  

Expected Output:

3

Edit: I am not able to think of how to approach this problem. So, please provide me with a hint, so that i can further try implementing it. I'll be greatful to even the slightest help.

Update:

1 <= N <= 10^5
1 <= u,v <= N  
1 <= K <= 10^6  

A naive approach to such a problem will not pass under any cirumstances. The complexity of the solution should be either O(n**2) or O(nlogn).


Solution

  • This problem can be solved using Dynamic Programming on Trees, you can read about it here https://www.geeksforgeeks.org/dynamic-programming-trees-set-2/.

    Let's split the problem into two parts, the first one is to find the number of paths that are valid in the subtree of a node u. The second part is to find the answer for the node u if we do not consider its subtree but going to it's parent and so on.

    Let us consider 1 as the root of the tree.

    Now for solving the first part, we will make an array in[] in which we will store the the number of paths in the subtree of node u so in[u] will represent the number of valid paths starting from the node u and visiting its subtree. To compute this array we can run a simple dfs as follows :

    //u is the current node - p is the parent
    void dfs1(int u, int p) {
        for (int i = 0; i < g[u].size(); i++) { //iterate through the adjacency of node u
            int v = g[u][i]; // v is the child of u
            if (v != p) { //in case v is the parent just ignore it
                dfs1(v, u); //go to node v
                if (u >= k && v >= k) { //if value of u or v is less than k then we cant start at this node so it should be 0
                    in[u] += in[v] + 1; //add to number of paths of u the number of paths starting from the node v + 1 (+1 because we can also go from u to v)
                }
            }
        }
    }
    

    For solving the second part we need an array out[] with out[u] being the number of paths that are valid if we do not consider the subtree of u which is going to the parent of u and so on.

    lets call the parent of u P[u]. To compute out[u] we will rely on p[u]. out[i] is number of valid paths if we go to p[u], and what we can do once we reach p[u] is either go through its subtree (excluding the brach of u of course) or visit p[p[u]] (the parent of parent of u) so out[u] is (out[p[u]] + 1) + (in[p[u]] - in[u] - 1).

    // u is the current node - p is the parent
    void dfs2(int u, int p) {
        for (int i = 0; i < g[u].size(); i++) { // iterate through the adjacency of node u
            int v = g[u][i]; // v is the child of u
            if (v != p) { // in case v is the parent just ignore it
                if (v >= k && u >= k) { // if value of u or v is less than k then we cant start at this node so it should be 0
                    out[v] += (out[u] + 1); //add to the number of paths of v the number of paths from it's parent (u) + 1 if we dont use the subtree of u
                    out[v] += (in[u] - in[v] - 1); // add to the number of paths of v the number of paths from it's parent (u)
                    // we should subtract in[v] + 1 because we want to exclude this branch from the subtree of the parent
                }
                dfs2(v, u); // go to node v
            }
        }
    }
    

    To find the answer just sum all out[u] + in[u] for all the nodes and divide by 2 because each path was computed twice.

    complexity O(V+E)