algorithm · time-complexity · nested-loops · big-o

What is the time complexity of the following nested loops in Big-Theta notation?


for (int i = 1; i < n; i++)
    for (int j = i; j < n; j *= 2)
        for (int k = j; k < n; k *= 2);

I know the time complexity is O(n·log²n), but I want it in Big-Theta notation and would like to know how to prove that this is indeed the time complexity.


Solution

  • Big Theta notation applies when the time complexity is the same in the worst and the best case.

    As that is the situation you have at hand, you can simply restate what you have in Big O as Big Theta.

    Worst and best case would differ if other, unknown data were involved, such as an array of size n whose values you do not know beforehand. In that case an algorithm may have a different running time depending on that data. Another example is when a random generator is involved (which is just another way to get unknown data).

    But as you don't have any unknown data in your algorithm, the time complexity can be written in Big Theta notation.
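
    For contrast, here is a minimal sketch of an algorithm whose best and worst cases do differ because of unknown input data (a plain linear search; the function name is mine, purely illustrative):

    function linearSearch(arr, target) {
        // Best case Θ(1): the target sits at index 0.
        // Worst case Θ(n): the target is absent from arr.
        // So no single Θ bound covers all inputs of size n.
        for (var idx = 0; idx < arr.length; idx++) {
            if (arr[idx] === target) return idx;
        }
        return -1;
    }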

    Proof of the time complexity

    The complexity however is not Θ(n log²n). It is in fact a plain Θ(n).

    The time complexity can be derived from the number of increments made to the count variable in this extended code, because for every iteration of the outer loop the middle loop is executed at least once, and for every iteration of the middle loop the innermost loop is executed at least once:

    for (i = 1; i < n; i++) {
        for (j = i; j < n; j *= 2)
            for (k = j; k < n; k *= 2)
                count++;
    }

    The inner two loops make the same number of increments as in this alternative (pseudo) code, where log stands for the logarithm with base 2:

    for (j = log(i); j < log(n); j++)
        for (k = j; k < log(n); k++)
            count++;
    

    Or also:

    for (j = 0; j < log(n) - log(i); j++)
        for (k = j; k < log(n) - log(i); k++)
            count++;
    

    Define m as the number of iterations made by the loop on j for a given i; then m is log n − log i rounded upward to the closest integer. The number of increments made for a given i corresponds to the number of ways to pick 2 values among m values where the second is greater than or equal to the first. This number is m(m+1)/2.
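
    As a quick sanity check of this triangular count (and of the loop transformation above), here is a small script; the helper names innerCount and doublings are mine, not part of the question:

    // Brute-force count of the two inner loops for a given i.
    function innerCount(i, n) {
        var count = 0;
        for (var j = i; j < n; j *= 2)
            for (var k = j; k < n; k *= 2)
                count++;
        return count;
    }

    // m = ceil(log2(n) - log2(i)), computed by repeated doubling
    // so that no floating-point logarithms are involved.
    function doublings(i, n) {
        var m = 0;
        for (var j = i; j < n; j *= 2) m++;
        return m;
    }

    // Check innerCount(i, n) === m(m+1)/2 for every i.
    var n = 1024, ok = true;
    for (var i = 1; i < n; i++) {
        var m = doublings(i, n);
        if (innerCount(i, n) !== m * (m + 1) / 2) ok = false;
    }
    console.log(ok ? 'formula verified for n = ' + n : 'mismatch found');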

    Removing the upward rounding, we can find an upper bound for m by adding one to log n − log i. Let's redefine m as this upper bound:

        m = log n − log i + 1

    Then we get this upper bound for the number of count increments made by the inner two loops for a given i:

        ½m(m+1)
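
    To gain confidence that this bounding step is sound, one can compare the exact rounded-up count M(M+1)/2 with this smooth bound numerically (doublings is the same hypothetical helper as above, repeated to keep the snippet self-contained):

    // M = ceil(log2(n/i)), computed by repeated doubling.
    function doublings(i, n) {
        var steps = 0;
        for (var j = i; j < n; j *= 2) steps++;
        return steps;
    }

    // Verify that the exact count M(M+1)/2 never exceeds the
    // smooth bound m(m+1)/2 with m = log2(n) - log2(i) + 1.
    var n = 1024, ok = true;
    for (var i = 1; i < n; i++) {
        var M = doublings(i, n);
        var m = Math.log2(n) - Math.log2(i) + 1;
        if (M * (M + 1) / 2 > m * (m + 1) / 2) ok = false;
    }
    console.log(ok ? 'upper bound holds for n = ' + n : 'bound violated');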

    Breaking this down into the logarithm components, we get:

          ½(log n − log i + 1)(log n − log i + 2)
        = ½log²n + (3/2)log n + 1 + (−log n − 3/2)log i + ½log²i

    Adding to this the effect of the outer loop, we get this upper bound for the overall count, where the sums run over i from 1 to n:

        count < ½n log²n + (3/2)n log n + n + (−log n − 3/2)Σᵢlog i + ½Σᵢlog²i

    The sums over i need to be resolved. The sum of logarithms is the logarithm of the product, so we have this equality:

        Σᵢlog i = log(n!)

    The n in n! comes from the number of iterations of the outer loop. In fact, the outer loop does not iterate that many times, because its last iteration has i = n−1, but for the purpose of defining an upper bound we can take the i = n term on board as well.

    By applying the Stirling formula, with conversion of the natural logarithms (noted here as ln) to base-2 logarithms, we get this useful asymptotic approximation:

          log(n!) = (n + ½)log n − n/ln2 + ½log(2π) + O(1/n)
                  = n log n + (−1/ln2)n + ½log n + ½log(2π) + O(1/n)
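
    This approximation is easy to check numerically; the following sketch compares the exact sum Σᵢlog i = log(n!) with the formula above (dropping the O(1/n) term):

    var LN2 = Math.log(2);

    // Exact log2(n!) as a running sum of base-2 logarithms.
    function log2Factorial(n) {
        var s = 0;
        for (var i = 2; i <= n; i++) s += Math.log2(i);
        return s;
    }

    // Stirling: (n + 1/2)·log2(n) − n/ln2 + 1/2·log2(2π).
    function stirling(n) {
        return (n + 0.5) * Math.log2(n) - n / LN2 + 0.5 * Math.log2(2 * Math.PI);
    }

    [10, 100, 1000, 10000].forEach(function (n) {
        // The difference shrinks like O(1/n).
        console.log(n, log2Factorial(n), stirling(n));
    });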

    For the last term with a sum, Σᵢlog²i, we can use the Euler-Maclaurin formula, and with the integral over this function (see also "sum of log squared terms"), we derive:

          Σᵢlog²i = n[log²n − (2/ln2)log n + 2/ln²2] − 2/ln²2 + ½log²n + O(log n/n)
                  = n log²n − (2/ln2)n log n + (2/ln²2)n − 2/ln²2 + ½log²n + O(log n/n)
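
    This estimate can be checked numerically in the same way (a sketch under the same assumptions as the previous one):

    var LN2 = Math.log(2);

    // Exact sum of squared base-2 logarithms.
    function sumLog2Squared(n) {
        var s = 0;
        for (var i = 2; i <= n; i++) s += Math.pow(Math.log2(i), 2);
        return s;
    }

    // Euler-Maclaurin estimate, dropping the O(log n / n) term.
    function estimate(n) {
        var L = Math.log2(n);
        return n * (L * L - (2 / LN2) * L + 2 / (LN2 * LN2))
             - 2 / (LN2 * LN2) + 0.5 * L * L;
    }

    [10, 100, 1000, 10000].forEach(function (n) {
        console.log(n, sumLog2Squared(n), estimate(n));
    });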

    Taking this into what we had for count:

        count < ½n log²n + (3/2)n log n + n
              + (−log n − 3/2)Σᵢlog i
              + ½Σᵢlog²i

    We get:

        count < ½n log²n + (3/2)n log n + n
              + (−log n − 3/2)[n log n + (−1/ln2)n + ½log n + ½log(2π)]
              + ½[n log²n − (2/ln2)n log n + (2/ln²2)n − 2/ln²2 + ½log²n]
              + O(log n/n)

              = ½n log²n + (3/2)n log n + n
              + (−n − ½)log²n + (1/ln2)n log n + (−½)log(2π)log n + (−3n/2 − 3/4)log n + (3/(2ln2))n + (−3/4)log(2π)
              + ½n log²n − (1/ln2)n log n + (1/ln²2)n − 1/ln²2 + (1/4)log²n
              + O(log n/n)

    Bringing together all the terms that have the same order in n, it turns out that the terms with factors n log²n and n log n cancel each other out. And so we get:

        count < (1 + 3/(2ln2) + 1/ln²2)n
              + (−1/4)log²n + ((−1/2)log(2π) − 3/4)log n
              + (−3/4)log(2π) − 1/ln²2
              + O(log n/n)

    So this is Θ(n), and as 1 + 3/(2ln2) + 1/ln²2 is about 5.24 and the next few terms all have negative coefficients, the following upper bound holds for count:

        count < 6n

    This settles the upper bound. The complexity cannot be better than Θ(n) either, since the outer loop alone executes Θ(n) times. Hence the overall complexity is Θ(n).
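
    As a final sanity check on the algebra (my own addition, not part of the original argument), the derived closed-form bound can be evaluated against the true count, dropping the vanishing O(log n/n) term:

    // Full brute-force count of the three nested loops.
    function count(n) {
        var c = 0;
        for (var i = 1; i < n; i++)
            for (var j = i; j < n; j *= 2)
                for (var k = j; k < n; k *= 2)
                    c++;
        return c;
    }

    // The derived upper bound, without the O(log n / n) term.
    function bound(n) {
        var LN2 = Math.log(2), L = Math.log2(n), log2pi = Math.log2(2 * Math.PI);
        return (1 + 3 / (2 * LN2) + 1 / (LN2 * LN2)) * n
             - 0.25 * L * L + (-0.5 * log2pi - 0.75) * L
             - 0.75 * log2pi - 1 / (LN2 * LN2);
    }

    [100, 1000, 10000, 100000].forEach(function (n) {
        // Expect: count(n) < bound(n) < 6n.
        console.log(n, count(n), bound(n).toFixed(1), 6 * n);
    });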

    Testing it

    Empirically you can see that the expression count/n converges towards the value 4 for large n. Here is a little snippet that keeps incrementing n, performs the algorithm, and outputs this fraction:

    function f(n) {
        var i, j, k, count = 0;
        for (i = 1; i <= n; i++) {
            for (j = i; j < n; j *= 2)
                for (k = j; k < n; k *= 2)
                    count++;
        }
        return count;
    }

    // Keep increasing n forever, and print f(n)/n.
    // Intended to run in a browser page styled with
    // body { white-space: pre } so that the newlines show up.
    var n = 2;
    setInterval(function () {
        var c = f(n);
        document.body.textContent = 'n = ' + n
                                  + '\nf(n) = ' + c
                                  + '\nf(n)/n = ' + (c / n);
        n = n + 1;
    }, 1);

    NB: I have the feeling that there must be a more concise and elegant proof for this, but this is what I could come up with for now.