Search code examples
pythonalgorithmgraphshortest-pathdijkstra

Difficulties with total cost of graph in Dijkstra algorithm in python


I'm currently trying to make an algorithm that gives me the total cost of a graph when all nodes have been visited but am failing miserably and honestly out of ideas. My goal is to get the total costs of the graphs below, using the Dijkstra algorithm.

Here's what I have so far:

from collections import defaultdict
import heapq

def build_graph():
    # Create the graph with the all given nodes and costs
    edges = aeroDistances
    graph = defaultdict(dict)

    for edge in edges.items():
        tuple1 = edge[0][0]
        tuple2 = edge[0][1]
        cost = edge[1]

        connection1 = {tuple2 : cost}
        connection2 = {tuple1 : cost}

        graph[tuple1].update(connection1)
        graph[tuple2].update(connection2)
    
    return dict(graph)

def dijkstra(graph, starting_vertex):
    # All the distances set to infinity
    distances = {vertex: float('infinity') for vertex in graph}
    # Distance from the starting vertex
    distances[starting_vertex] = 0
    # Priority queue
    pq = [(0, starting_vertex)]

    while len(pq) > 0:
        current_distance, current_vertex = heapq.heappop(pq)

        # Nodes can get added to the priority queue multiple times. We only
        # process a vertex the first time we remove it from the priority queue
        if current_distance > distances[current_vertex]:
            continue

        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight
            # Only consider this new path if it's better than any path we've
            # already found
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(pq, (distance, neighbor))

    return distances, distance

numCidades = 0
numAeroportos = 0
numEstradas = 0
#custoAeroportos = {}
#custoEstradas = {}

#custoAeroportos = {(1, 2): 2, (1, 3): 4}
#custoEstradas = {(3, 1, 'E'): 2}

#custoAeroportos = {(1, 2): 1, (2, 3): 2, (3, 4): 1, (4, 1): 1}
custoAeroportos = {(1, 2): 1, (1, 3): 6, (2, 4): 2, (3, 4): 2}
custoEstradas = {(2, 3): 3}

listCidades = [1,2,3]
distances = []
indexValue = 0
indexKey = 0
currentIndex = 0

# Deconstruct the dict into a list of keys (tuples)
# Deconstruct the dict into a list of values
# Make it easier to sort the connections by creating a list of tuples and
# their respective weights and zip them toghether
distancesAeroKeys = list(custoAeroportos.keys())
distancesAeroValues = list(custoAeroportos.values())
aeroDistances = dict(map(list, zip(*[distancesAeroKeys, distancesAeroValues])))

print()
print("AeroDistances: " + str(aeroDistances))

graph = build_graph()
print()
print("Graph: " + str(graph))
print()
print("Dijkstra: " + str(dijkstra(graph, 1)))

The two graphs, dicts, I'm currently trying this with are named custoAeroportos and I can't seem to get the total minimum cost when all nodes are visited.

Here're the graphs, they are fairly simple:

This one has a total cost of 5

This one has a total cost of 3

The total cost I'm getting is wrong and I can't figure it out.

For the first graph:

AeroDistances: {(1, 2): 1, (1, 3): 6, (2, 4): 2, (3, 4): 2}

Graph: {1: {2: 1, 3: 6}, 2: {1: 1, 4: 2}, 3: {1: 6, 4: 2}, 4: {2: 2, 3: 2}}

Dijkstra: ({1: 0, 2: 1, 3: 5, 4: 3}, 7)

For the second graph, which somehow is correct:

AeroDistances: {(1, 2): 1, (2, 3): 2, (3, 4): 1, (4, 1): 1}

Graph: {1: {2: 1, 4: 1}, 2: {1: 1, 3: 2}, 3: {2: 2, 4: 1}, 4: {3: 1, 1: 1}}

Dijkstra: ({1: 0, 2: 1, 3: 2, 4: 1}, 3)

I really appreciate your help, thank you.


Solution

  • Your function returns the distance of the path from the starting vertex to whichever was the last node that was added to the heap. This is not really what you want to return. Certainly when the BFS-tree has multiple outgoing edges from some vertices, this path has little to do with the total distance.

    Instead you need to accumulate the weights of the edges that are "accepted", i.e. those that are (implicitly) popped from the heap and improve the distance for that node.

    So I would suggest extending the tuples on the heap with one more information: the weight of the last edge that brought us to that node. When the node is accepted, then this edge becomes part of the spanning tree, and its weight should then be added to an accumulating total.

    Here is the adapted code. The changes have accompanying comments:

    def dijkstra(graph, starting_vertex):
        distances = {vertex: float('infinity') for vertex in graph}
        distances[starting_vertex] = 0
        graph_distance = 0  # this will be returned
        pq = [(0, 0, starting_vertex)]  # middle value is edge weight
    
        while len(pq) > 0:
            current_distance, edge_weight, current_vertex = heapq.heappop(pq)
            if current_distance > distances[current_vertex]:
                continue
    
            graph_distance += edge_weight  # accumulate
    
            for neighbor, weight in graph[current_vertex].items():
                distance = current_distance + weight
                if distance < distances[neighbor]:
                    distances[neighbor] = distance
                    heapq.heappush(pq, (distance, weight, neighbor))  # include weight
    
        return distances, graph_distance  # ...return it