python · graph · shortest-path · dijkstra · path-finding

Dijkstra Time Complexity Clarification


For a Dijkstra implementation that uses a minimum-heap priority queue, I set it up to search for a station that does not exist on the network, so that it has to check everything. My question is: since the overall time complexity is O((V + E) log V), why does locating the minimum distance to a non-existent station take longer on a network with 500 edges and 400 stations than on one with 500 edges and 500 stations?

Note: every station is connected by at least one edge. The network with |E| = |S| + 100 has 100 extra edges that are unique but randomly placed.

#  PSEUDOCODE

1. INIT QUEUE & DICTIONARY WITH DISTANCE TO SOURCE FOR EVERY STATION BEING INF
2. ENQUEUE SOURCE WITH DISTANCE 0
3. SET DISTANCE TO SOURCE FOR THE SOURCE ITSELF TO 0
4. LOOP WHILE QUEUE NOT EMPTY
      a. POP STATION FROM QUEUE (CALL IT P)
      b. IF P MARKED AS VISITED THEN RESTART LOOP (CONTINUE).
      c. IF P == TARGET RETURN DIST
      d. LOOP THROUGH ALL ADJACENT STATIONS (A) TO P
          i. IF (A IS NOT MARKED AS VISITED) AND (DIST TO P + DIST(P,A) < DIST TO A)
             1. CHANGE DIST TO A TO BE THE NEW DIST
             2. PUSH A ONTO THE PRIORITY QUEUE.
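The pseudocode above can be sketched as a runnable Python function using the standard-library `heapq` module. The graph representation (a dict mapping each station to a list of `(neighbor, weight)` pairs) is an assumption, since the original post does not show its data structures:

```python
import heapq

def dijkstra(graph, source, target):
    """Min-heap Dijkstra following the pseudocode above.

    graph: dict mapping station -> list of (neighbor, weight) pairs.
    Returns the shortest distance from source to target, or
    float('inf') if target is unreachable or not in the network.
    """
    dist = {station: float('inf') for station in graph}  # step 1
    dist[source] = 0                                     # step 3
    pq = [(0, source)]                                   # step 2
    visited = set()

    while pq:                                            # step 4
        d, p = heapq.heappop(pq)                         # 4a: pop station P
        if p in visited:                                 # 4b: skip stale entries
            continue
        visited.add(p)
        if p == target:                                  # 4c: early exit
            return d
        for a, w in graph[p]:                            # 4d: relax neighbors
            if a not in visited and d + w < dist[a]:     # 4d.i
                dist[a] = d + w                          # 4d.i.1
                heapq.heappush(pq, (dist[a], a))         # 4d.i.2
    return float('inf')  # queue drained: target was never reached
```

Searching for a station that is not in `graph` simply drains the queue and returns `float('inf')`, which is the exhaustive-scan behaviour the timing experiment relies on.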

Graph and Table depicting experimental time complexity


Solution

  • @Arne is right, this question depends a lot on the actual graph used.

|E| = |S| makes for an extremely sparse graph. Let's take an extreme example: say you have |S| = 500, |E| = 500, and all nodes are neatly arranged in a circle. At each iteration, the algorithm goes:

    • Take a single node from the priority queue.
    • Follow the single edge.
    • Add a single node to the PQ.

    All the while, the PQ is running close to empty. The PQ is an optimization over a regular queue, but if there is only one node in the queue, there is not much to optimize. There is no point in talking about O(log N) when you know N is always 1. Actual performance is much better than the worst-case performance predicted by Big-O notation.

    Now add 100 edges. Suddenly the algorithm has choices! The PQ fills up. Let's say the first node examined has 10 edges, and consequently the PQ has a size of 10. It might stay this size for dozens of iterations. The PQ finally has some work to do! This is where the extra time consumption comes from.
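This effect is easy to measure directly. The sketch below (a hypothetical setup, not the asker's actual test network) runs Dijkstra to exhaustion on a 500-node ring, then on the same ring with 100 random extra edges, and records the peak size of the priority queue in each case:

```python
import heapq
import random

def max_heap_size(graph, source):
    """Run Dijkstra to exhaustion (no target) and return the peak
    size reached by the priority queue."""
    dist = {s: float('inf') for s in graph}
    dist[source] = 0
    pq = [(0, source)]
    visited = set()
    peak = 0
    while pq:
        peak = max(peak, len(pq))
        d, p = heapq.heappop(pq)
        if p in visited:
            continue
        visited.add(p)
        for a, w in graph[p]:
            if a not in visited and d + w < dist[a]:
                dist[a] = d + w
                heapq.heappush(pq, (dist[a], a))
    return peak

n = 500
# Pure ring: each node connects only to its two circle neighbors.
ring = {i: [((i + 1) % n, 1), ((i - 1) % n, 1)] for i in range(n)}
print(max_heap_size(ring, 0))    # stays tiny: the frontier is 2 nodes

# Ring plus 100 random chords, as in the question's |E| = |S| + 100 case.
random.seed(42)
dense = {i: list(ring[i]) for i in range(n)}
for _ in range(100):
    u, v = random.sample(range(n), 2)
    dense[u].append((v, 1))
    dense[v].append((u, 1))
print(max_heap_size(dense, 0))   # noticeably larger
```

On the pure ring the queue never holds more than the two frontier nodes, so each heap operation is effectively O(1); the extra chords make the frontier (and hence the per-operation log factor) grow, which is where the extra time goes.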

    P.S. it's Dijkstra, not Djikstra.