I was trying to estimate the worst-case running time of an algorithm that looks like this (the complexity estimates in the comments are mine, where V is the number of vertices and E is the number of edges in the graph):
while (nodes.size() != 0) { // O(V); nodes is a LinkedList
    Vertex u = Collections.min(nodes); // O(V)
    nodes.remove(u); // O(1)
    for (Map.Entry<Vertex, Integer> adjacency : u.adj.entrySet()) { // O(E)
        Vertex v = adjacency.getKey(); // Some O(1) statement
        if (nodes.contains(v)) { // O(V)
            // Some O(1) statement
        }
    }
}
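
For reference, here is a minimal compilable sketch of the kind of setup I mean -- the Vertex class, the sample graph, and the relaxation line inside the if are just illustrative placeholders I made up for this post, not my actual code:

import java.util.*;

class Vertex implements Comparable<Vertex> {
    String name;
    int dist = Integer.MAX_VALUE;               // the value Collections.min compares on
    Map<Vertex, Integer> adj = new HashMap<>(); // neighbour -> edge weight

    Vertex(String name) { this.name = name; }

    @Override
    public int compareTo(Vertex other) {
        return Integer.compare(this.dist, other.dist);
    }
}

public class LoopSketch {
    public static void main(String[] args) {
        // Tiny example graph: a-b (1), b-c (2), a-c (4)
        Vertex a = new Vertex("a"), b = new Vertex("b"), c = new Vertex("c");
        a.dist = 0;
        a.adj.put(b, 1); a.adj.put(c, 4);
        b.adj.put(a, 1); b.adj.put(c, 2);
        c.adj.put(a, 4); c.adj.put(b, 2);

        LinkedList<Vertex> nodes = new LinkedList<>(Arrays.asList(a, b, c));

        while (nodes.size() != 0) {                // O(V) iterations
            Vertex u = Collections.min(nodes);     // O(V)
            nodes.remove(u);                       // O(1), as estimated above
            for (Map.Entry<Vertex, Integer> adjacency : u.adj.entrySet()) { // O(E)
                Vertex v = adjacency.getKey();     // O(1)
                if (nodes.contains(v)) {           // O(V)
                    // some O(1) statement, e.g. a relaxation step
                    v.dist = Math.min(v.dist, u.dist + adjacency.getValue());
                }
            }
        }
        System.out.println(b.dist + " " + c.dist); // prints: 1 3
    }
}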
My question is very straightforward: after every round of the while loop, the nodes LinkedList gets smaller and smaller, so both the Collections.min() and nodes.contains() operations take less time each round. My understanding is that Big O notation always considers the worst case, so the complexities above should be correct. If that's not right, would you please explain how to work out the correct complexity in the above scenario?
nodes.contains has worst-case time complexity in Θ(V), and the for loop runs Θ(E) times, so the for loop has worst-case time complexity in Θ(V*E). Collections.min also has worst-case time complexity in Θ(V), so the body of the while loop has worst-case time complexity in Θ(V + V*E); but V + V*E is itself Θ(V*E) (see below), so the body of the while loop has worst-case time complexity in Θ(V*E). The while loop executes V times, so the worst-case time to execute the algorithm is in Θ(V^2*E).
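
In symbols, that is just the multiplication described above (the last step uses E >= 1, which is justified next):

\[
T_{\text{worst}}(V, E)
  \;=\; V \cdot \bigl(\Theta(V) + \Theta(E) \cdot \Theta(V)\bigr)
  \;=\; \Theta\bigl(V \cdot (V + E \cdot V)\bigr)
  \;=\; \Theta\bigl(V^2 + E \cdot V^2\bigr)
  \;=\; \Theta\bigl(E \cdot V^2\bigr).
\]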
The simplification there -- replacing Θ(V + V*E) with Θ(V*E) -- is acceptable because we are looking at the general case of a graph with at least one edge, i.e. E >= 1. In that case V*E is always at least as big as V, so we can absorb the V term into a bounded constant factor. It would also be correct to say the worst-case time is in Θ(V^2 + E*V^2), but one wouldn't usually write that, since the simplified form is more useful.
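
Spelled out, that absorption is just a pair of inequalities. Assuming E >= 1:

\[
V \cdot E \;\le\; V + V \cdot E \;\le\; V \cdot E + V \cdot E \;=\; 2 \, V \cdot E,
\]

so V + V*E is squeezed between two constant multiples of V*E, which is exactly what V + V*E ∈ Θ(V*E) means.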
Incidentally, as a matter of intuition, you can generally ignore the effect of containers being "used up" during an algorithm, such as with insertion sort having fewer and fewer items to look through, or this algorithm having fewer and fewer nodes to scan. Those effects turn into constant factors and disappear. It's only when you're eliminating an interesting number of elements each time, such as with the quickselect algorithm or binary search, that that sort of thing starts to affect the asymptotic runtime.
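
As a quick sanity check of that intuition, here is a throwaway sketch (the class name and the numbers are mine, purely illustrative) that compares the exact work done when the container shrinks by one each round, V + (V-1) + ... + 1, against the crude V*V bound used above:

public class ShrinkingContainerDemo {
    public static void main(String[] args) {
        for (int v = 1_000; v <= 64_000; v *= 2) {
            long exact = 0;            // total cost when each round scans a list that has shrunk by one
            for (long size = v; size >= 1; size--) {
                exact += size;         // this round scans the remaining 'size' elements
            }
            long bound = (long) v * v; // crude bound: V rounds, each scanning all V elements
            System.out.printf("V=%6d  exact=%15d  V*V=%15d  ratio=%.3f%n",
                    v, exact, bound, (double) exact / bound);
        }
    }
}

The ratio hovers around 0.5 for every V, which is the bounded constant factor at work: the shrinking roughly halves the work but doesn't change the growth rate.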