algorithm · time-complexity · nested-loops · big-o

What is the time complexity of the following nested loops in Big-Theta notation?


for (int i = 1; i < n; i++)
    for (int j = i; j < n; j *= 2)
        for (int k = j; k < n; k *= 2);

I know the time complexity is O(n·log²n), but I want it in Big-Theta notation and would like to know how to prove that this is indeed the time complexity.


Solution

  • Big Theta notation applies when the time complexity is the same in the worst and the best case.

    As that is the situation you have at hand, you can simply restate what you have in Big O as Big Theta.

    Worst and best case would differ if other, unknown data were involved, such as an array of size n whose values you do not know beforehand. In that case an algorithm may have a different running time depending on that data. Another example is when a random generator is involved (which is just another way to get unknown data).

    But as you don't have any unknown data in your algorithm, the time complexity can be written in Big Theta notation.
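
    For contrast, here is a minimal sketch of an algorithm whose best and worst cases do differ because of unknown input data (a plain linear search; the function name is mine, purely illustrative):

    function linearSearch(arr, target) {
        // Best case Θ(1): the target sits at index 0.
        // Worst case Θ(n): the target is absent from arr.
        // So no single Θ bound covers all inputs of size n.
        for (var idx = 0; idx < arr.length; idx++) {
            if (arr[idx] === target) return idx;
        }
        return -1;
    }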

    Proof of the time complexity

    The complexity however is not Θ(n log²n). It is in fact a plain Θ(n).

    The time complexity can be derived from the number of increments made to the count variable in this extended code, because for every iteration of the outer loop the middle loop is executed at least once, and for every iteration of the middle loop the innermost loop is executed at least once:

    for (i = 1; i < n; i++) {
        for (j = i; j < n; j *= 2)
            for (k = j; k < n; k *= 2)
                count++;
    }

    The inner two loops make the same number of increments as in this alternative (pseudo) code, where log stands for the logarithm with base 2:

    for (j = log(i); j < log(n); j++)
        for (k = j; k < log(n); k++)
            count++;
    

    Or also:

    for (j = 0; j < log(n) - log(i); j++)
        for (k = j; k < log(n) - log(i); k++)
            count++;
    

    Define m as the number of iterations made by the loop on j for a given i; then m is log n − log i rounded upward to the closest integer. The number of increments made for a given i corresponds to the number of ways to pick 2 values among m values where the second is greater than or equal to the first. This number is m(m+1)/2.
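
    As a quick sanity check of this triangular count (and of the loop transformation above), here is a small script; the helper names innerCount and doublings are mine, not part of the question:

    // Brute-force count of the two inner loops for a given i.
    function innerCount(i, n) {
        var count = 0;
        for (var j = i; j < n; j *= 2)
            for (var k = j; k < n; k *= 2)
                count++;
        return count;
    }

    // m = ceil(log2(n) - log2(i)), computed by repeated doubling
    // so that no floating-point logarithms are involved.
    function doublings(i, n) {
        var m = 0;
        for (var j = i; j < n; j *= 2) m++;
        return m;
    }

    // Check innerCount(i, n) === m(m+1)/2 for every i.
    var n = 1024, ok = true;
    for (var i = 1; i < n; i++) {
        var m = doublings(i, n);
        if (innerCount(i, n) !== m * (m + 1) / 2) ok = false;
    }
    console.log(ok ? 'formula verified for n = ' + n : 'mismatch found');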

    Removing the upward rounding, we can find an upper bound for m by adding one to log n − log i. Let's redefine m as this upper bound:

        m = log n − log i + 1

    Then we get this upper bound for the number of count increments made by the inner two loops for a given i:

        ½m(m+1)
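
    To gain confidence that this bounding step is sound, one can compare the exact rounded-up count M(M+1)/2 with this smooth bound numerically (doublings is the same hypothetical helper as above, repeated to keep the snippet self-contained):

    // M = ceil(log2(n/i)), computed by repeated doubling.
    function doublings(i, n) {
        var steps = 0;
        for (var j = i; j < n; j *= 2) steps++;
        return steps;
    }

    // Verify that the exact count M(M+1)/2 never exceeds the
    // smooth bound m(m+1)/2 with m = log2(n) - log2(i) + 1.
    var n = 1024, ok = true;
    for (var i = 1; i < n; i++) {
        var M = doublings(i, n);
        var m = Math.log2(n) - Math.log2(i) + 1;
        if (M * (M + 1) / 2 > m * (m + 1) / 2) ok = false;
    }
    console.log(ok ? 'upper bound holds for n = ' + n : 'bound violated');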

    Breaking this down into the logarithm components, we get:

          ½(log n − log i + 1)(log n − log i + 2)
        = ½log²n + (3/2)log n + 1 + (−log n − 3/2)log i + ½log²i

    Adding to this the effect of the outer loop, we get this upper bound for the overall count, where the sums run over i from 1 to n:

        count < ½n log²n + (3/2)n log n + n + (−log n − 3/2)Σᵢlog i + ½Σᵢlog²i

    The sums over i need to be resolved. The sum of logarithms is the logarithm of the product, so we have this equality:

        Σᵢlog i = log(n!)

    The n in n! comes from the number of iterations of the outer loop. In fact, the outer loop does not iterate that many times, because its last iteration has i = n−1, but for the purpose of defining an upper bound we can take the i = n term on board as well.

    By applying the Stirling formula, with conversion of the natural logarithms (noted here as ln) to base-2 logarithms, we get this useful asymptotic approximation:

          log(n!) = (n + ½)log n − n/ln2 + ½log(2π) + O(1/n)
                  = n log n + (−1/ln2)n + ½log n + ½log(2π) + O(1/n)
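
    This approximation is easy to check numerically; the following sketch compares the exact sum Σᵢlog i = log(n!) with the formula above (dropping the O(1/n) term):

    var LN2 = Math.log(2);

    // Exact log2(n!) as a running sum of base-2 logarithms.
    function log2Factorial(n) {
        var s = 0;
        for (var i = 2; i <= n; i++) s += Math.log2(i);
        return s;
    }

    // Stirling: (n + 1/2)·log2(n) − n/ln2 + 1/2·log2(2π).
    function stirling(n) {
        return (n + 0.5) * Math.log2(n) - n / LN2 + 0.5 * Math.log2(2 * Math.PI);
    }

    [10, 100, 1000, 10000].forEach(function (n) {
        // The difference shrinks like O(1/n).
        console.log(n, log2Factorial(n), stirling(n));
    });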

    For the last term with a sum, Σᵢlog²i, we can use the Euler-Maclaurin formula, and with the integral over this function (see also "sum of log squared terms"), we derive:

          Σᵢlog²i = n[log²n − (2/ln2)log n + 2/ln²2] − 2/ln²2 + ½log²n + O(log n/n)
                  = n log²n − (2/ln2)n log n + (2/ln²2)n − 2/ln²2 + ½log²n + O(log n/n)
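
    This estimate can be checked numerically in the same way (a sketch under the same assumptions as the previous one):

    var LN2 = Math.log(2);

    // Exact sum of squared base-2 logarithms.
    function sumLog2Squared(n) {
        var s = 0;
        for (var i = 2; i <= n; i++) s += Math.pow(Math.log2(i), 2);
        return s;
    }

    // Euler-Maclaurin estimate, dropping the O(log n / n) term.
    function estimate(n) {
        var L = Math.log2(n);
        return n * (L * L - (2 / LN2) * L + 2 / (LN2 * LN2))
             - 2 / (LN2 * LN2) + 0.5 * L * L;
    }

    [10, 100, 1000, 10000].forEach(function (n) {
        console.log(n, sumLog2Squared(n), estimate(n));
    });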

    Taking this into what we had for count:

        count < ½n log²n + (3/2)n log n + n
              + (−log n − 3/2)Σᵢlog i
              + ½Σᵢlog²i

    We get:

        count < ½n log²n + (3/2)n log n + n
              + (−log n − 3/2)[n log n + (−1/ln2)n + ½log n + ½log(2π)]
              + ½[n log²n − (2/ln2)n log n + (2/ln²2)n − 2/ln²2 + ½log²n]
              + O(log n/n)

              = ½n log²n + (3/2)n log n + n
              + (−n − ½)log²n + (1/ln2)n log n + (−½)log(2π)log n + (−3n/2 − 3/4)log n + (3/(2ln2))n + (−3/4)log(2π)
              + ½n log²n − (1/ln2)n log n + (1/ln²2)n − 1/ln²2 + (1/4)log²n
              + O(log n/n)

    Bringing together all the terms that have the same order in n, it turns out that the terms with factors n log²n and n log n cancel each other out. And so we get:

        count < (1 + 3/(2ln2) + 1/ln²2)n
              + (−1/4)log²n + ((−1/2)log(2π) − 3/4)log n
              + (−3/4)log(2π) − 1/ln²2
              + O(log n/n)

    So this is Θ(n), and as 1 + 3/(2ln2) + 1/ln²2 is about 5.24 and the next few terms all have negative coefficients, the following upper bound holds for count:

        count < 6n

    This settles the upper bound. The complexity cannot be better than Θ(n) either, since the outer loop alone executes Θ(n) times. Hence the overall complexity is Θ(n).
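
    As a final sanity check on the algebra (my own addition, not part of the original argument), the derived closed-form bound can be evaluated against the true count, dropping the vanishing O(log n/n) term:

    // Full brute-force count of the three nested loops.
    function count(n) {
        var c = 0;
        for (var i = 1; i < n; i++)
            for (var j = i; j < n; j *= 2)
                for (var k = j; k < n; k *= 2)
                    c++;
        return c;
    }

    // The derived upper bound, without the O(log n / n) term.
    function bound(n) {
        var LN2 = Math.log(2), L = Math.log2(n), log2pi = Math.log2(2 * Math.PI);
        return (1 + 3 / (2 * LN2) + 1 / (LN2 * LN2)) * n
             - 0.25 * L * L + (-0.5 * log2pi - 0.75) * L
             - 0.75 * log2pi - 1 / (LN2 * LN2);
    }

    [100, 1000, 10000, 100000].forEach(function (n) {
        // Expect: count(n) < bound(n) < 6n.
        console.log(n, count(n), bound(n).toFixed(1), 6 * n);
    });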

    Testing it

    Empirically you can see that the expression count/n converges towards the value 4 for large n. Here is a little snippet that keeps incrementing n, performs the algorithm, and outputs this fraction:

    function f(n) {
        var i, j, k, count = 0;
        for (i = 1; i <= n; i++) {
            for (j = i; j < n; j *= 2)
                for (k = j; k < n; k *= 2)
                    count++;
        }
        return count;
    }

    // Keep increasing n forever, and print f(n)/n.
    // Intended to run in a browser page styled with
    // body { white-space: pre } so that the newlines show up.
    var n = 2;
    setInterval(function () {
        var c = f(n);
        document.body.textContent = 'n = ' + n
                                  + '\nf(n) = ' + c
                                  + '\nf(n)/n = ' + (c / n);
        n = n + 1;
    }, 1);

    NB: I have the feeling that there must be a more concise and elegant proof for this, but this is what I could come up with for now.