I wrote a simple function to sort an array int a[];
using hash.
For that I stored frequency for every element in new array hash1[]
and then I put back in original array in linear time.
#include<bits/stdc++.h>
using namespace std;
int hash1[10000];
void sah(int a[],int n)
{
int maxo=-1;
for(int i=0;i<n;i++)
{
hash1[a[i]]++;
if(maxo<a[i]){maxo=a[i];}
}
int i=0,freq=0,idx=0;
while(i<maxo+1)
{
freq=hash1[i];
if(freq>0)
{
while(freq>0)
{
a[idx++]=i;freq--;
}
}
i++;
}
}
int main()
{
int a[]={6,8,9,22,33,59,12,5,99,12,57,7};
int n=sizeof(a)/sizeof(a[0]);
sah(a,n);
for(int i=0;i<n;i++)
{
printf("%d ",a[i]);
}
}
This algorithm runs in O(max_element). What kind of disadvantages I'm facing here considering only performance( time and space)?
The algorithm you've implemented is called counting sort. Its runtime is O(n + U), where n is the total number of elements and U is the maximum value in the array (assuming the numbers go from 0 to U), and its space usage is Θ(U). Your particular implementation assumes that U = 10,000. Although you've described your approach as "hashing," this really isn't a hash (computing some function of the elements and using that to put them into buckets) as a distribution (spreading elements around according to their values).
If U is a fixed constant - as it is in your case - then the runtime is O(n) and the space usage is O(1), though remember that big-O talks about long-term growth rates and that if U is large the runtime can be pretty high. This makes it attractive if you're sorting very large arrays with a restricted range of values. However, if the range of values can be large, this is not a particularly good approach. Interestingly, you can think of radix sort as an algorithm that repeatedly runs counting sort with U = 10 (if using the base-10 digits of the numbers) or U = 2 (if going in binary) and has a runtime of O(n log U), which is strongly preferable for large values of U.
You can clean up this code in a number of ways. For example, you have an if
statement and a while
loop with the same condition, which can be combined together into a single while
loop. You also might want to put in some assert checks to make sure all the values are in the range from 0 to 9,999, inclusive, since otherwise you'll have a bounds error. Additionally, you could consider making the global array either a local variable (though watch your stack usage) or a static
local variable (to avoid polluting the global namespace). You could alternatively have the user pass in a parameter specifying the maximum size or could calculate it yourself.