Search code examples
algorithmsortingtime-complexitybubble-sort

Calculate Big Theta Notation Bubble Sort


I'm trying to calculate the theta notation for bubble sort but I'm getting stuck. Given this procedure (pseudocode):

procedure BUBBLE_SORT(A,n) {
    array A(1 to n)
    for (int i = 1; i <= n; i++) {
        for(int j = 1; j <= n-1; j++) {
            if(A[j] > A[j+1] {
                //swap(A(j), A(j+1))
            }
        }
    }
}

I was able to get the worst case running time with sigma notation to be:

(n^2 - n) / 2

For the best case running time, I followed my book and did this:

Given p(n) = (n^2 - n) / 2, we claim that p(n) = Θ(n^2). To prove this we show that for some constants c1, c2 and n0:

C1n^2 <= (n^2 / 2) + (n / 2) <= C2n^2

Dividing both sides by n^2, we get:

C1 <= (1 / 2) + (1 / 2n) <= C2

This is where I got lost. In the book, the author picked out some numbers, plugged them in and said "Therefore it follows that p(n) = Θ(n^2)"

How do I know what numbers to plug in? Can I just plug in any numbers? And if those numbers do fit the inequality, does that mean I can immediately say that the algorithm is Θ(n^2)?

Thank you!


Solution

  • Remember that this is about asymptotic behavior.

    We can think about it as playing a game: Your goal is to prove that a certain bound holds. The game goes like this: First, you get to chose the constants C1 and C2. Then I get to chose an arbitrary n. If I can do this in a way that it violates the inequality, you lose (the bound does not hold). If there is no way I can do this, even though you are no longer allowed to change the C1 and C2 that you picked, you win (the bound is correct).

    Let's look at the equation in question now:

    C1 <= (1 / 2) + (1 / 2n) <= C2

    Since n is an integer (in the end, it represents the number of elements in your array), there is a definite smallest value: n = 1. Let's substitute that in:

    (1 / 2) + (1 / 2) = 1

    Alright, now that's a start. Let's see what happens as I make n bigger... Note that the n only appears in the denominator. So I cannot mess up your bound by making it arbitrary big. The worst case for you is actually for small values of n. With bigger values of n the second part of the product gets smaller and eventually disappears. For the limit n->inf we get:

    lim n->inf (1 / 2) + (1 / 2n)

    (1 / 2) + lim n->inf (1 / 2n) = (1 / 2) + 0 = 1 / 2

    So what that tells us is, whichever value of n I choose, the resulting value will always be in the range 1/2 to 1 (since the equation is linear in n these two samples are enough to show that; with more complex equations, establishing the bounds usually requires a little more work).

    With this knowledge, can you pick C1 and C2 in a way that I can never win the game?