sorting time time-complexity complexity-theory

Time complexity question: I think it is impossible with the given data

The question is:

Let A be an array of integers in the range {1, ..., n}. Give an O(n) algorithm that gets rid of duplicate numbers, and sorts the elements of A in decreasing order of frequency, starting with the element that appears the most. For example, if A = [5, 1, 3, 1, 7, 7, 1, 3, 1, 3], then the output should be [1, 3, 7, 5].

The thing is, to know how many times each number from 1 to n appears, we need to make a pass over A, whose length is m (m = A.length, which is not given to us).

With bucket sort, as long as m = O(n), it is possible.

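I think there is a problem with the question, because m is not guaranteed to be O(n): it could be Θ(n), but it could also be Ω(n) without being O(n), in which case a single pass over A already takes more than O(n) time.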

So basically, I think that without specifying what m is, it is impossible to achieve O(n).

If someone knows a way to solve this problem, I would be glad to hear it. Thanks.

Solution

Sorting is a separate concern. For example, radix sort runs in O(kn) time, which is linear if k is a constant.

The main concern should be the counting step. If you can manage to count all the occurrences in O(n) time, then the total is still O(radixSort) + O(n) = O(kn + n) = O(kn) in the end.

Consider an approach like this: take the elements as the keys of a hash table and their counts as the values.

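    import java.util.HashMap;
    import java.util.Map;

    // 'array' is assumed to be the input int[] (A in the question)
    Map<Integer, Integer> hashMap = new HashMap<>();
    for (int element : array) {
        if (hashMap.containsKey(element)) {
            // element is already in the hash table:
            // take its value (its count) and update it
            hashMap.put(element, hashMap.get(element) + 1);
        } else {
            // element hasn't been seen yet, so put it
            // as a key-value pair with its count initialized to 1
            hashMap.put(element, 1);
        }
    }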

This runs in O(N) time, where N is the number of elements in the array, as long as each hash-table operation takes O(1) time. In your case the keys are small integers from {1, ..., n}, so hashing produces few collisions, and the hashMap operations used above (put, get and the membership check) effectively run in O(1) time.

Finally, you can choose any sorting method to sort the hash-table entries by their counts. Whatever the time complexity of that sort is, it will be the time complexity of the entire process.
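To make the two steps concrete, here is a minimal sketch of one way to put them together. It is only an illustration under some assumptions, not the only way to do it: it assumes every value lies in {1, ..., n}, it swaps the hash table for a plain count array (possible because the keys are small integers), and it sorts by bucketing on the count instead of using a comparison sort. The names FrequencySort and sortByFrequency are made up for this example. With m = A.length, the running time is O(n + m), which is O(n) only when m = O(n).

```java
import java.util.ArrayList;
import java.util.List;

class FrequencySort {
    // Hypothetical helper: assumes every value of a lies in {1, ..., n}.
    static List<Integer> sortByFrequency(int[] a, int n) {
        int m = a.length;

        // Step 1: count the occurrences of each value in one pass, O(m).
        int[] count = new int[n + 1];
        for (int x : a) {
            count[x]++;
        }

        // Step 2: buckets.get(f) holds the values that occur exactly f times.
        // A count can never exceed m, so m + 1 buckets are enough.
        List<List<Integer>> buckets = new ArrayList<>(m + 1);
        for (int f = 0; f <= m; f++) {
            buckets.add(new ArrayList<>());
        }
        for (int v = 1; v <= n; v++) {
            if (count[v] > 0) {
                buckets.get(count[v]).add(v); // each distinct value is added once
            }
        }

        // Step 3: read the buckets from the highest count down to 1.
        // Values with equal counts come out in increasing order.
        List<Integer> result = new ArrayList<>();
        for (int f = m; f >= 1; f--) {
            result.addAll(buckets.get(f));
        }
        return result;
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 3, 1, 7, 7, 1, 3, 1, 3};
        System.out.println(sortByFrequency(a, 7)); // prints [1, 3, 7, 5]
    }
}
```

Because duplicates collapse into a single count, each distinct value appears exactly once in the output. If m can grow faster than n, the first loop alone already costs more than O(n), which is exactly the concern raised in the question.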