Search code examples
pythonpython-3.xalgorithmstructurebreadth-first-search

RunTime Error for BFS to get distance to each node from a given start node


I am doing HackerRank problem BFS Shortest reach:

Consider an undirected graph where each edge weighs 6 units. Each of the nodes is labeled consecutively from 1 to n.

You will be given a number of queries. For each query, you will be given a list of edges describing an undirected graph. After you create a representation of the graph, you must determine and report the shortest distance to each of the other nodes from a given starting position using the breadth-first search algorithm (BFS). Return an array of distances from the start node in node number order. If a node is unreachable, return −1 for that node.

Example

The following graph is based on the listed inputs:

enter image description here

n = 5  // number of nodes
n = 3 // number of edges
edges = [1, 2], [1, 3], [3, 4]
s = 1 // starting node

All distances are from the start node 1. Outputs are calculated for distances to nodes 2 through 5: [6, 6, 12, -1]. Each edge is 6 units, and the unreachable node 5 has the required return distance of −1.

Function Description

Complete the bfs function. If a node is unreachable, its distance is −1.

bfs has the following parameter(s):

  • int n: the number of nodes
  • int m: the number of edges
  • int edges[m][2]: start and end nodes for edges
  • int s: the node to start traversals from

Returns

int[n-1]: the distances to nodes in increasing node number order, not including the start node (-1 if a node is not reachable)

My attempt

def bfs(n, m, edges, s):
    # Write your code here
    # create hashmap, containing adj verticies
    resList = [-1] * n 
    
    # processing the input
    adjMap = {}
    for edge in edges:
        if edge[0] not in adjMap:
            newSet = set()
            adjMap[edge[0]] = newSet
        if edge[1] not in adjMap:
            newSet = set()
            adjMap[edge[1]] = newSet 
        adjMap[edge[0]].add(edge[1])
        adjMap[edge[1]].add(edge[0])
        
    
    
    queue = deque()
    valQueue = deque()
    visited = set()
    # initialise queue
    queue.append(s)
    valQueue.append(0)
    resList[s - 1] = 0
    visited.add(s)

    
    while queue:
        # dequeueNode = queue.pop(0)
        dequeueNode = queue.popleft()
        # dequeueNodeVal = valQueue.pop(0)
        dequeueNodeVal = valQueue.popleft()
        adjSet = adjMap[dequeueNode]
        
        for adj in adjSet:
            if adj not in visited:
                visited.add(adj)
                queue.append(adj)
                valQueue.append(dequeueNodeVal + 6)
                resList[adj - 1] = dequeueNodeVal + 6
    resList.remove(0)
    
    
    
    return resList

Originally my queue was just a list, ergo:

queue = []
queue.append(blah)
queue.pop(0)

I would pop from the start of the list, which I know is inefficient, as it requires reindexing. Anyway, I changed it to the deque module from collections as I know that's more efficient.

Problem

My solution fails one of the six tests. It seems to encounter a runtime error for very large input sizes. Hacker Rank just displays:

Compiler message: Runtime error

The input consists of 6 test cases, each representing large graphs (like with 30000 edges...).

Runtime error pic

I have optimised everything I feel that I can: I'm using a linear running algorithm. The only thing that I think could be messing up my time efficiency is maybe the processing input section where I put all my stuff inside a hash map. However, that is just O(edges) which is the size of the edges array given.

The output that is expected is large, and I cannot see how I can effectively debug with such large inputs/outputs, and I could not reproduce a problem with small inputs:

what output I need

What am I missing? I got a coding interview tomorrow...


Solution

  • The problem occurs when s is a node that has no edges. In that case your adjMap will have no entry for s and a runtime error will occur in the first iteration of your main loop, at this statement:

    adjSet = adjMap[dequeueNode]
    

    Once you realise this, it is trivial to solve this, and there are many ways to do it. For instance, you could replace this statement:

    adjMap = {}
    

    with this one:

    adjMap = {key: set() for key in range(n + 1)}
    

    ...and then you can also drop the if statements in the first loop.

    Other improvements

    With the above change it will work, but I'd like to take the opportunity to suggest some other changes:

    • adjMap could be a list instead of a dictionary, using the same principle as you used for resList
    • You can omit visited, and instead check whether resList has a -1 at the node's index. If so, the node was not visited yet
    • You can omit the second deque, because that value can be retrieved from resList.
    • Instead of .remove(0), you could call pop in order to remove that same start-node entry.
    • You could translate node numbers to 0-based indexes from the start, so that all your indexing in the main loop will be without this minus-1 correction.
    • Avoid adding 6 in each iteration of the loop. Do this at the moment you get the value before the loop
    • The neighbors collection does not really have to be a set, it can be a simple list, so that the adjacency list is a 2D list.

    That results in this code:

    def bfs(n, m, edges, s):
        resList = [-1] * n
        
        # Create the adjacency list as a 2D list and initialise all entries
        adjMap = [[] for _ in range(n)]
        # Populate the adjacency list, and convert immediately to 0-based indexing
        for a, b in edges:
            adjMap[a - 1].append(b - 1)
            adjMap[b - 1].append(a - 1)
        s -= 1  # ...also the start index
            
        queue = deque((s, ))
        resList[s] = 0
        
        while queue:
            dequeueNode = queue.popleft()
            # Read the value from our result list and immediately add the 6
            distance = resList[dequeueNode] + 6
            
            for adj in adjMap[dequeueNode]:
                if resList[adj] < 0:  # This value can serve for "visited" detection
                    queue.append(adj)
                    resList[adj] = distance
        resList.pop(s)  # No need to search for a 0. We know where it is.
        return resList