I'm trying to help a friend analyze the complexity of his algorithm but my understanding of Big-O notation is quite limited.
The code goes like this:
int SAMPLES = 2000;
int K_SAMPLES = 5000;
int i = 0; // initial index position
while (i < SAMPLES)
{
enumerate(); // Complexity: O(SAMPLES)
int neighbors = find_neighbors(i); // Complexity: O(1)
// Worst case scenario, neighbors is the same number of SAMPLES
int f = 0;
while (f < neighbors) // This loop is probably O(SAMPLES) as well.
{
int k = 0; // counter variable
while (k < K_SAMPLES) // Not sure how to express the complexity of this loop.
{ // Worst case scenario K_SAMPLES might be bigger than SAMPLES.
// do something!
k++;
}
f++;
}
i++;
}
There are 2 functions inside the code but I was able to identify their complexity since they are simple. However, I was unable to express the complexity of the inner while
loop, but even after it is measured, I still need help to assemble all these complexities into a formula that represents the computational complexity of the algorithm.
I seriously need help on this matter. Thanks!
Worst case analysis going from inner most loop to outer most (with mild abuse of the "=" sign):
-> O(K_SAMPLES) -- complexity of just the k-loop
-> neighbors * O(K_SAMPLES) -- complexity of f-loop accounted for
= SAMPLES * O(K_SAMPLES) -- since neighbors = SAMPLES in worst case
= O(SAMPLES * K_SAMPLES)
-> O(SAMPLES) + O(SAMPLES * K_SAMPLES) -- adding complexity of enumerate()
= O(SAMPLES + SAMPLES * K_SAMPLES)
= O(SAMPLES * K_SAMPLES)
The SAMPLES
term was dropped since SAMPLES * K_SAMPLES
grows asymptotically faster. More formally,
When C >= 2, SAMPLES >= 1, K_SAMPLES >= 1 then
SAMPLES + SAMPLES * K_SAMPLES <= C(SAMPLES * K_SAMPLES)
SAMPLES * (K_SAMPLES + 1) <= SAMPLES * C * K_SAMPLES
K_SAMPLES + 1 <= C * K_SAMPLES
For more info on big-O with multiple variables, see here. Continuing on with the last loop we have:
-> SAMPLES * O(SAMPLES * K_SAMPLES) -- complexity of i-loop accounted for
= O(SAMPLES^2 * K_SAMPLES)
Note that depending on the average number returned by find_neighbors(i)
, it can make the average big-O different.