Search code examples
algorithmbig-ocomplexity-theoryinsertion-sort

Does insertion sort have a Θ(n) value?


It is known that insertion sort has a best case running time of n and a worst case running time of n2. In that case, does it have a big theta value?


Solution

  • Short answer, yes, it does. Big-Theta always exists. The questions are: Are you talking about best case, worst case, or average case? and Can you prove it?

    Best case and worst case are not the same thing as Big-O or Big-Theta. Best case has a Big-Theta. Worst case has a different Big-Theta. They're not the same.

    Let me explain the distinction I'm making.

    Case vs. bound

    People doing complexity analysis are usually really lax with their notation. This is a great question where it pays to be precise. Case and bound are distinct, orthogonal concepts. It's crucial not to conflate them.

    • Case: Best case, worst case, average case
    • Bound: Upper bound (O), lower bound (Ω), exact bound (Θ)

    Each of the cases has its own bounds. When you're analyzing an algorithm's complexity you need to first identify which case you're analyzing. Are you looking at the best case performance? The worst case? The average case?

    Then when you analyze that case you can try to identify its upper and lower bounds.

    Loose bounds

    It's important to recognize that there aren't singular upper or lower bounds. There are many of them. For example, let's look at worst case performance of insertion sort. It has many lower bounds, and many upper bounds.

    • It's trivial to establish a lower bound of Ω(1). I mean, every algorithm has a constant lower bound. It's not always a tight lower bound, but it is a lower bound.
    • Similarly, one could start with a really loose upper bound of O(n!). This is how long it'd take if you simply iterated over every permutation of the input list until you ran across a sorted one. It's the world's worst sorting algorithm. Certainly insertion sort performs better than that, right?

    Tight bounds

    Lower and upper bounds set a floor and a ceiling for how well and how badly an algorithm can perform in the best/worst/average case. They're not that useful if they're way too loose, though. The tighter you can make them, the better.

    • Ω(n) would be a tighter lower bound than Ω(1). And it'd be easy to prove Ω(1): a sorting algorithm certainly has to examine every element of the input list, which takes linear time. If we prove that then we know that the worst case performance of insertion sort is not only Ω(1), it's also Ω(n).
    • As you say, it's well known that the worst case performance has an upper bound of O(n2). This is a much tighter upper bound than O(n!).

    Equal bounds

    If you're able to prove that the lower and upper bounds are the same then you get an exact bound, Θ. Big Theta. To do this you need to squeeze the upper and lower bounds together until they meet at the exact right answer.

    • We know that O(n2) is the tightest possible upper bound.
    • To get Θ, then, we'd need to tighten the lower bound to Ω(n2). Proving that is not really the point of this answer, so let's just assume we've done that.

    If we're able to prove that insertion sort's worst case performance is at best Ω(n2) and at worst O(n2) then we know it is exactly Θ(n2).

    Multiple Θs

    Everything I wrote above was in reference to worst case performance. If you wanted to look at the best case performance, or average case, you'd have to repeat all the analysis again for those. You'd have to establish their own upper and lower bounds and tighten them until they're equal.

    If you did that, you'd end up with three answers. Three Big-Thetas.

    • Best case: Θ(n)
    • Worst case: Θ(n2)
    • Average case: Θ(n2)

    In fact, you could even come up with more Big-Thetas. Best, worst, and average case aren't the only cases one can analyze. They're the most common, sure, but I can imagine other ones which would have their own lower and upper bounds.