Search code examples
algorithmdepth-first-searchbreadth-first-search

Compute distance using DFS


enter image description here

I was torn between these two methods:

M1:

  • Use adjacency list to represent graph G with vertices P and edges A
  • Use DFS on G storing all the distances from p in an array d;
  • Loop through d checking all entries. If some d[u] >6, return false otherwise true

M2:

  • Use adjacency list to represent graph G with vertices P and edges A
  • Use BFS on G storing all the distances from p in an array d;
  • Loop through d checking all entries. If some d[u] >6, return false otherwise true

Both these methods will produce a worst case O(|P| + |A|), therefore I think that both would be a correct answer to this question. I had chosen the DFS method, with the reasoning that with DFS you should be able to find the "outlier" of freedom degree 7 earlier than with BFS, since with BFS you would have to traverse every single Vertex until degree 7 in every case.

Apparently this is wrong according to the teacher, as using DFS, you can't compute the distances. I don't understand why you wouldn't be able to compute the distances. I could have a number n indicating the degree of freedom I am currently at. Starting from root p, the child would have n = 1. Now I store n in array d. Then I keep traversing down until no child is to be found, while incrementing n and storing the value in my array d. Then, if the back-tracking starts, the value n will be decremented until we find an unvisited child node of any of the visited nodes on the stack. If there is an unvisited child, increment once again, then increment until no more child is found, decrement until the next unvisited child from the stack is found...

I believe that would be a way to store the distances with DFS


Solution

  • Both BFS and DFS can do the job: they can both limit their search to a depth of 6, and at the end of the traversal they can check whether the whole population was reached or not. But there are some important differences:

    With BFS

    The BFS traversal is the algorithm I would opt for. When a BFS search determines the degree of a person, it is definitive: no correction needs to be made to it.

    Here is sketch of how you can do this with BFS:

    visited = set()  # empty set
    frontier = []  # empty array
    visited.add(p)  # search starts at person p
    frontier.append(p)
    for degree in [1, 2, 3, 4, 5, 6]:
        nextFrontier = []  # empty array
        for person in frontier:
            for acquaintance in A[person]:
                if acquaintance not in visited:
                    visited.add(acquaintance)
                    nextFrontier.append(acquaintance)
        frontier = nextFrontier
        if size(visited) == size(P):  # have we reached the whole population?
            return True
    # After six rounds we did not reach all people, so...
    return False
    

    This assumes that you can find the list of acquaintances for a given person via A[person]. If A is not structured like an adjacency list but as a list of pairs, then first do some preprocessing on the original A to create such an adjacency list.

    With DFS

    A DFS algorithm has as downside that it will not necessarily start with optimal paths, and so it will find that some persons have degree 6, while there really are shorter, uninvestigated paths that could improve on that degree. This means that a DFS algorithm may need to revisit nodes and even partial paths (edges) to register such improvements and cascade them through a visited path up to degree 6. And there might even be several improvements to be applied for the same person.

    A DFS algorithm could look like this:

    degreeOfPerson = dict()  # empty key/value dictionary
    for person in P:
        degreeOfPerson[person] = 7  # some value greater than 6
    
    function dfs(person, degree):
        if degree >= 7:
            return  # don't lose time for higher degrees than 6.
        for acquaintance in A[person]:
            if degree < degreeOfPerson[acquaintance]:  # improvement?
                degreeOfPerson[acquaintance] = degree
                dfs(acquaintance, degree+1)
    
    # start DFS
    degreeOfPerson[p] = 0
    dfs(p, 1)
    
    # Check if all persons got a degree of maximum 6
    for person in P:
        if degreeOfPerson[person] > 6:
            return False
    return True
    

    Example

    If the graph has three nodes, linked as a triangle a-b-c, with starting point a, then this would be the sequence. Indentation means (recursive) call of dfs:

    degreeOfPerson[a] = 0
        a->b: degreeOfPerson[b] = 1
            b->c: degreeOfPerson[c] = 2
                c->a: # cannot improve degreeOfPerson[a]. Backtrack
                c->b: # cannot improve degreeOfPerson[b]. Backtrack
            b->a: # cannot improve degreeOfPerson[a]. Backtrack
        a->c: degreeOfPerson[c] = 1  # improvement!
            c->a: # cannot improve degreeOfPerson[a]. Backtrack
            c->b: # cannot improve degreeOfPerson[b]. Backtrack
    

    Time Complexity

    The number of times the same edge can be visited with DFS is not more than the maximum degree we are looking for -- in your case 6. If that is a constant, then it does not affect the time complexity. If however the degree to check for is an input value, then the time complexity of DFS becomes O(maxdegree * |E| + |V|).