Search code examples
pythonalgorithmconstraintsdepth-first-searchbacktracking

Depth-First-Search, Backtracking when constraint failed


So, I do understand depth-first-search is not appropriate for this problem and something like UCS or Astar would be a lot better, just attempting to see if it is possible with a DFS approach.

I need to find a path within a cost budget, my approach using dfs is to keep the cost to get to the next node in the stack as they get pushed, and if going to the next node exceeds, it ignores does not push.

The problem I am facing is that when the budget is exceeded, I am struggling to find a way to set the nodes that have led to this path to be set back as unvisited so newer paths can consider them. I feel it has something to do with setting visited/unvisited properly and have tested a couple but in a bigger graph input, it always fails to find a path within constraint (Have confirmed that it exist using other methods)

def DFS(start, end, budget, adj_list, dist_list, cost_list):
# TODO Depth-First Search
print(f"DFS path from {start} to {end} with budget of {budget}...")

# Set all initial to not visited
visited = [False] * (len(adj_list) + 1)
parent = [None] * (len(adj_list) + 1)
stack = deque()

# Push starting node into the stack
stack.append((start, 0))
parent[start] = -1

path = deque()
pathFound = False

while (len(stack)):
    next_node_tup = stack.pop()
    next_node = int(next_node_tup[0])
    energy_used = next_node_tup[1]

    # Found target, proceed to print path
    if (next_node == end):
        pathFound = True    
        break

    # Set to visited if not yet
    # if (not visited[next_node]):
    #     visited[next_node] = True
    
    # Add all connecting nodes to top of stack
    edgeNodes = adj_list[str(next_node)]
    #print(f"There are {len(edgeNodes)} nodes connecting to {next_node}")

    for node in edgeNodes:
        node = int(node)
        #print(f"Node {node} connecting to node {next_node}")

        if (not visited[node]):
            # If taking this node exceeds budget, we dont want it
            minipath = str(next_node) + "," + str(node)
            print(f"Cost to go from {next_node} to {node} is {cost_list[minipath]}")
            energy_if_take = energy_used + cost_list[minipath]
            print(f"Energy used if path taken is {energy_if_take}")
            if (energy_if_take <= budget):
                parent[node] = next_node
                stack.append((node, energy_if_take))
                visited[node] = True
            else:
                print(f"Budget exceeded")
                # ! Need to set backtracked nodes to not visited
                # ! Problem is here!
                visited[node] = False
        
    print(stack)

if pathFound:
    curr = end
    path = []
    distance = 0
    energy = 0
    print("Printing path")
    while(curr != -1):
        path.append(curr)
        if (parent[curr] != -1):
            miniPath = str(parent[curr]) + "," + str(curr)
            distance += dist_list[miniPath]
            energy += cost_list[miniPath]
        curr = parent[curr]

    # Reverse and print path
    path.reverse()
    print(*path, sep = " -> ")
    print(f"Number of jumps is {len(path)}")
    print(f"Distance from {start} to {end} is {distance}")
    print(f"Cost from {start} to {end} is {energy}")
else:
    print("No path that satisfies budget")

Any hints are suggestions would be appreciated, been stuck here for quite a few days :(


Solution

  • Here is a modified version of your code, with a test case, that addresses the issue(s) in your question.

    Assumptions:

    • The graph is directed, with edges as per adj_list;
    • The graph may have cycles (hence the desire to track a prior encounter using visited);
    • We want to traverse the graph using DFS.

    Key logic:

    • Append start node to stack with waiting_for_adj_list flag set to False
    • In the DFS loop, pop a node from stack and either (1) mark it as visited, re-append it to stack with waiting_for_adj_list flag set to True and append its children to stack (subject to detection of a cycle or a broken budget) or (2) reset visited status for the node and its children.
    • Exit early upon reaching end node at or under budget.

    Note that I have taken some liberties in type usage (list instead of deque for stack, dict instead of list for parent and visited, maybe a couple of others) to avoid complexity and/or complications outside the focus of your question.

        def DFS(start, end, budget, adj_list, dist_list, cost_list):
            visited = {start: True}
            parent = {}
            path = []
            pathFound = False
            stack = [(start, 0, False)] #node, energy, waiting_for_adj_list
            while stack:
                next_node, energy_used, waiting_for_adj_list = stack.pop()
                next_node = int(next_node)
                if next_node == end:
                    pathFound = True   
                    print(f"Path found at cost {energy_used}: {path + [next_node]}")
                    break
                if waiting_for_adj_list:
                    # now that we're done with next_node and its children, mark children unvisited
                    edgeNodes = adj_list[str(next_node)]
                    for node in edgeNodes:
                        node = int(node)
                        if parent[node] == next_node:
                            visited[node] = False
                    visited[next_node] = False
                    print(f"done traversing {next_node}, visited {[node for node, v in visited.items() if v]}")
                    path.pop()
                    continue
                stack.append((next_node, energy_used, True)) # append to stack again with waiting_for_adj_list == True
                visited[next_node] = True
                path.append(next_node)
                edgeNodes = adj_list[str(next_node)]
                for node in edgeNodes:
                    node = int(node)
                    if node in visited and visited[node]:
                        # detected a cycle, don't follow it
                        print(f"Cycle detected: {path + [node]}")
                    else:
                        minipath = str(next_node) + "," + str(node)
                        energy_if_take = energy_used + cost_list[minipath]
                        if (energy_if_take <= budget):
                            parent[node] = next_node
                            stack.append((node, energy_if_take, False))
                        else:
                            print(f"Budget {budget} exceeded at cost {energy_if_take} for path {path + [node]}")
                            # node is still available to be explore from other parents, but don't put on stack
                print(f"stack {stack}")
                print(f"path {path}")
                
            if pathFound:
                curr = end
                path = []
                distance = 0
                energy = 0
                print("Printing path")
                while(curr != -1):
                    path.append(curr)
                    if (curr in parent):
                        miniPath = str(parent[curr]) + "," + str(curr)
                        distance += dist_list[miniPath]
                        energy += cost_list[miniPath]
                    curr = parent[curr] if curr in parent else -1
    
                # Reverse and print path
                path.reverse()
                print(*path, sep = " -> ")
                print(f"Number of jumps is {len(path)}")
                print(f"Distance from {start} to {end} is {distance}")
                print(f"Cost from {start} to {end} is {energy}")
            else:
                print("No path that satisfies budget")
                
        # Test case
        start, end = 0, 7
        budget = 200
        adj_list = {
            '0':{1, 2, 3},
            '1':{4}, '2':{4}, '3':{4},
            '4':{5, 6},
            '5':{7},
            '6':{7, 8},
            '7':{},
            '8':{4}
        }
        #4,6,8,4 is a cycle
        cost_list = {
            '0,1':10,
            '0,2':20,
            '0,3':30,
            '1,4':40,
            '2,4':400,
            '3,4':40,
            '4,5':50,
            '4,6':60,
            '5,7':100,
            '6,7':700,
            '6,8':8,
            '8,4':40
        }
        #0,3,4,6,8,4 has a cycle
        #0,3,4,6,7 short-circuits at 830 cost
        #0,3,4,5,7 short-circuits at 220 cost
        #0,2,4 short-circuits at 420 cost
        #0,1,4,6,8,4 is a cycle
        #0,1,4,5,7 costs 200
        dist_list = {
            '0,1':1,
            '0,2':1,
            '0,3':1,
            '1,4':1,
            '2,4':1,
            '3,4':1,
            '4,5':1,
            '4,6':1,
            '5,7':1,
            '6,7':1,
            '6,8':1,
            '8,4':1            
        }
        DFS(start, end, budget, adj_list, dist_list, cost_list)
    

    This code:

    • Extends the tuple on the stack to have a third element waiting_for_adj_list, so when we pop a next_node we immediately append it again with this flag set to True, or if already True we do some clean-up for the children of next_node.
    • Uses visited purely to detect cycles, and clears it away for children as part of the aforementioned clean-up, so that if nodes downstream of next_node are also on other paths for nodes upstream of next_node, they can be traversed and the cost analysis particular to such paths can proceed unimpeded.
    • On detecting a cycle, takes no special action except to break the cycle.

    Test case stdout is:

    stack [(0, 0, True), (1, 10, False), (2, 20, False), (3, 30, False)]
    path [0]
    stack [(0, 0, True), (1, 10, False), (2, 20, False), (3, 30, True), (4, 70, False)]
    path [0, 3]
    stack [(0, 0, True), (1, 10, False), (2, 20, False), (3, 30, True), (4, 70, True), (5, 120, False), (6, 130, False)]
    path [0, 3, 4]
    Budget 200 exceeded at cost 830 for path [0, 3, 4, 6, 7]
    stack [(0, 0, True), (1, 10, False), (2, 20, False), (3, 30, True), (4, 70, True), (5, 120, False), (6, 130, True), (8, 138, False)]
    path [0, 3, 4, 6]
    Cycle detected: [0, 3, 4, 6, 8, 4]
    stack [(0, 0, True), (1, 10, False), (2, 20, False), (3, 30, True), (4, 70, True), (5, 120, False), (6, 130, True), (8, 138, True)]
    path [0, 3, 4, 6, 8]
    done traversing 8, visited [0, 3, 6]
    done traversing 6, visited [0, 3]
    Budget 200 exceeded at cost 220 for path [0, 3, 4, 5, 7]
    stack [(0, 0, True), (1, 10, False), (2, 20, False), (3, 30, True), (4, 70, True), (5, 120, True)]
    path [0, 3, 4, 5]
    done traversing 5, visited [0, 3]
    done traversing 4, visited [0, 3]
    done traversing 3, visited [0]
    Budget 200 exceeded at cost 420 for path [0, 2, 4]
    stack [(0, 0, True), (1, 10, False), (2, 20, True)]
    path [0, 2]
    done traversing 2, visited [0]
    stack [(0, 0, True), (1, 10, True), (4, 50, False)]
    path [0, 1]
    stack [(0, 0, True), (1, 10, True), (4, 50, True), (5, 100, False), (6, 110, False)]
    path [0, 1, 4]
    Budget 200 exceeded at cost 810 for path [0, 1, 4, 6, 7]
    stack [(0, 0, True), (1, 10, True), (4, 50, True), (5, 100, False), (6, 110, True), (8, 118, False)]
    path [0, 1, 4, 6]
    Cycle detected: [0, 1, 4, 6, 8, 4]
    stack [(0, 0, True), (1, 10, True), (4, 50, True), (5, 100, False), (6, 110, True), (8, 118, True)]
    path [0, 1, 4, 6, 8]
    done traversing 8, visited [0, 6, 1]
    done traversing 6, visited [0, 1]
    stack [(0, 0, True), (1, 10, True), (4, 50, True), (5, 100, True), (7, 200, False)]
    path [0, 1, 4, 5]
    Path found at cost 200: [0, 1, 4, 5, 7]
    Printing path
    0 -> 1 -> 4 -> 5 -> 7
    Number of jumps is 5
    Distance from 0 to 7 is 4
    Cost from 0 to 7 is 200
    

    There are undoubtedly optimization opportunities remaining, but based on my understanding of your question and objective as outlined in the assumptions in this answer, I believe this code should find a path within a cost budget using DFS.