I have two doubts about the counting sort algorithm:
How is the complexity only O(n)
? There are 5 for loops in the algorithm. Should the complexity for each for loop be n? So result in n^4
complexity? I know I am wrong and counting sort is linear, but I want to know why my reasoning is wrong.
If the counting sort algorithm is linear, why is it O(n+K)
and not just O(n)
, if the K
is added it is not linear anymore right?
This is the counting sort code I am referring too:
void countSort(char *str)
{
// The output character array that will have sorted str
char output[strlen(str)];
// Create a count array to store count of inidividul characters and
// initialize count array as 0
int count[RANGE + 1], i;
memset(count, 0, sizeof(count));
// Store count of each character
for(i = 0; str[i]; ++i)
++count[str[i]];
// Change count[i] so that count[i] now contains actual position of
// this character in output array
for (i = 1; i <= RANGE; ++i)
count[i] += count[i-1];
// Build the output character array
for (i = 0; str[i]; ++i)
{
output[count[str[i]]-1] = str[i];
--count[str[i]];
}
// Copy the output array to str, so that str now
// contains sorted characters
for (i = 0; str[i]; ++i)
str[i] = output[i];
}
To answer 1:
Let there be three 'for' loops that each works in O(n)
therefore the overall complexity is of:
O(n)+O(n)+O(n) = O(n+n+n) = O(3n)
in the case of 3n
, the 3
is a constant and can be ignored and thus the complexity diminishes to O(n)
To answer 2:
the algorithm depends not only on the size N
of the array, but also on the range K
of the numbers collected in it. it would require a larger counting array and thus iterations to count sort the array:
{1,10000000,3,2,0} // K=10000001
than it would:
{1,4,5,2,3} // K=6
therefore the complexity the for loops used your code are calculated this way:
for(i = 0; str[i]; ++i) // O(n)
for (i = 1; i <= RANGE; ++i) // O(k)
for (i = 0; str[i]; ++i) // O(n)
for (i = 0; str[i]; ++i) // O(n)
and the overall complexity is that of O(n)+O(n)+O(n)+O(k) = O(n+k)
and to pinpoint my answer to your question: the algorithm complexity is still viewed as linear.