Search code examples
haskellfunctional-programminginfinite-loopdepth-first-search

Haskell Depth-First Graph Traversal has Infinite Loop


I have spent some time recently trying to make a depth-first traversal algorithm in Haskell. However, with my one I would like the function to return a 'visited' list, which has every node visited in order.

Demonstration: with the graph [('A', "BC"), ('B', "ACD"), ('C', "AB"), ('D', "B")] the output should be "ABCBD" since I am starting on A and finishing on D.

Here is my code:

import Data.List (elemIndex)

trav :: [(Char, String)] -> Char -> Char -> String -> String
trav [] _ _ _ = []
trav graph node endNode visited
    | not (elem node visited) = trav graph node endNode (visited ++ [node])  -- repeat but after adding this node to visited list.
    | node == endNode = visited
    | not (null unvisitedNodes) = trav graph (head unvisitedNodes) endNode visited
    | not (null visited) = if (length prevIndex) > 1 then (trav graph (visited !! (getIndex (head prevIndex) visited)) endNode (visited ++ [(visited !! (getIndex (head prevIndex) visited))])) else (trav graph (head (tail (reverse visited))) endNode (visited ++ [(head (tail (reverse visited)))]))
    | otherwise = visited
    where
        unvisitedNodes = [x | x <- getAttachedVertexes graph node, not (elem x visited)]
        prevIndex = [x | x <- (reverse visited), x < (head (reverse visited))]

getAttachedVertexes :: [(Char, String)] -> Char -> [Char]
getAttachedVertexes graph node = case lookup node graph of
    Just value -> value
    Nothing -> ""

getIndex :: Eq a => a -> [a] -> Int
getIndex item list = case elemIndex item list of
    Just index -> index
    Nothing -> error "help"

This code actually works fine and outputs the demonstrated ideal output which I wrote above when inputted with the following: trav [('A', "BC"), ('B', "ACD"), ('C', "AB"), ('D', "B")] 'A' 'D' ""

However, this code starts an infinite recursion loop when I enter the following input: trav [('A', "CDE"), ('B', "D"), ('C', "AD"), ('D', "ABC"), ('E', "A")] 'A' 'E' ""

There is obviously no error message as it's an infinite loop.


My thought process:

I have just edited this question with an updated Haskell code, which is the result of the below thinking:

I first thought that part of the problem was that, when backtracking, the node which was backtracked to was added to the visited list, meaning that when I do head (tail (reverse visited)) the item that returns is actually the same node that is currently being traversed.

I 'translated' this code into Python 3, which is the language I have the most experience in, and kept it in Haskell format, i.e. a function containing a series of if statements all with a return line within them.

Next, I attempted to fix the above error. It returns the correct output for the first input I said above ({"A":["B", "C"], "B":["A", "C", "D"], "C":["A", "B"], "D":["B"]}) however, when I try the second input above ({"A":['C', 'D', 'E'], "B":['D'], "C":['A','D'], "D":['A','B','C'], "E":['A']}) it again starts an infinite loop (well, at least until the stack overflows).

Here is said Python code:

def trav(graph, node, endNode, visited):
    if node not in visited:
        return trav(graph, node, endNode, (visited + [node]))
    if node == endNode:
        return visited
    unvisitedNodes = [x for x in getAttachedNodes(graph, node) if x not in visited]
    if len(unvisitedNodes) > 0:
        return trav(graph, unvisitedNodes[0], endNode, visited)
    if len(visited) > 0:
        prevIndex = [x for x in reversed(visited) if x < visited[-1]]
        if len(prevIndex) > 1:
            return trav(graph, visited[visited.index(prevIndex[0])], endNode, (visited + [visited[visited.index(prevIndex[0])]]))
        else:
            return trav(graph, visited[-2], endNode, (visited + [visited[-2]]))
    return visited


if __name__ == '__main__':
    graph1 = {"A":["B", "C"], "B":["A", "C", "D"], "C":["A", "B"], "D":["B"]}
    node = input("Node: ")
    endNode = input("EndNode: ")
    graph2 = {"A":['C', 'D', 'E'], "B":['D'], "C":['A','D'], "D":['A','B','C'], "E":['A']}
    visited = []
    print(trav(graph2, node, endNode, visited))

I have since re-written the Haskell code, and it now runs exactly the same way in all tested cases as the above Python code. Obviously this only helps those who are capable in Python. I wrote it to hopefully be able to understand the problem. My next edit will be the result of the debugging of this Python code.

I apologise for it's messiness and inefficiency, but I want to mirror as closely as possible the above Haskell code.

Anyway, I included this to hopefully make clearer my algorithm.

In conclusion, the problem was not what I explained I thought it was just above.

Thanks again.


Solution

  • I have just solved this. Here is the updated Haskell code:

    import Data.List (elemIndex)
    
    trav :: [(Char, String)] -> Char -> Char -> String -> String
    trav [] _ _ _ = []
    trav graph node endNode visited
        | not (elem node visited) = trav graph node endNode (visited ++ [node])  -- repeat but after adding this node to visited list.
        | node == endNode = visited
        | not (null unvisitedAttachedNodes) = trav graph (head unvisitedAttachedNodes) endNode visited
        | not (null visited) = if not (null attachedNodesWithUnvisited) then (trav graph (head attachedNodesWithUnvisited) endNode (visited ++ [(head attachedNodesWithUnvisited)])) else (trav graph (head (tail (reverse visited))) endNode (visited ++ [head (tail (reverse visited))]))
        | otherwise = visited
        where
            unvisitedAttachedNodes = [x | x <- getAttachedVertexes graph node, not (elem x visited)]
            prevIndex = [x | x <- (reverse visited), x < (head (reverse visited))]
            unvisitedPrevNodes = [x | x <- prevIndex, elem x [y | y <- (getAttachedVertexes graph x), not (elem y visited)]]
            nodeToGoBackTo = visited !! (getIndex (head (reverse unvisitedPrevNodes)) visited)
            attachedNodesWithUnvisited = [nb | nb <- (getAttachedVertexes graph node), (hasUnvisited graph nb visited)]
    
    getAttachedVertexes :: [(Char, String)] -> Char -> [Char]
    getAttachedVertexes graph node = case lookup node graph of
        Just value -> value
        Nothing -> ""
    
    hasUnvisited :: [(Char, String)] -> Char -> String -> Bool
    hasUnvisited graph node visited = not (null unvisited) where
        unvisited = [x | x <- (getAttachedVertexes graph node), not (elem x visited)]
    
    count :: Eq a => a -> [a] -> Int
    count item list = length (filter (== item) list)
    
    getIndex :: Eq a => a -> [a] -> Int
    getIndex item list = case elemIndex item list of
        Just index -> index
        Nothing -> error "help"
    

    and here is the updated equivalent Python code:

    def trav(graph, node, endNode, visited):
        if node not in visited:
            return trav(graph, node, endNode, (visited + [node]))
        if node == endNode:
            return visited
        unvisitedAttachedNodes = [x for x in getAttachedNodes(graph, node) if x not in visited]
        if len(unvisitedAttachedNodes) > 0:
            return trav(graph, unvisitedAttachedNodes[0], endNode, visited)
        if len(visited) > 0:  # needs to backtrack
            attachedNodesWithUnvisited = [nb for nb in graph[node] if has_unvisited(graph, nb, visited)]
            if len(attachedNodesWithUnvisited) > 0:
                return trav(graph, attachedNodesWithUnvisited[0], endNode, visited + [attachedNodesWithUnvisited[0]])
            else:
                return trav(graph, visited[-2], endNode, visited + [visited[-2]])
        return visited
    

    I realised that the infinite loop was caused by, whenever backtracking was necessary, the program only went back by one node, but then added that node to the visited list, so head (tail (reverse visited)) (or Python equivalent visited[-2] meant the program kept going between two nodes. As you can see clearly in the new Python translation, I am now going to visited[-2] unless there is an attached nodes with unvisited neighbouring nodes.

    this now works with all the graphs I have tested with it.

    Thanks very much for everyone's help!