Okay so I have a huge array of unsorted elements of an unknown data type (all elements are of the same type, obviously, I just can't make assumptions as they could be numbers, strings, or any type of object that overloads the < and > operators. The only assumption I can make about those objects is that no two of them are the same, and comparing them (A < B) should give me which one should show up first if it was sorted. The "smallest" should be first.
I receive this unsorted array (type std::vector, but honestly it's more of an algorithm question so no language in particular is expected), a number of objects per "group" (groupSize), and the group number that the sender wants (groupNumber).
I'm supposed to return an array containing groupSize elements, or less if the group requested is the last one. (Examples: 17 results with groupSize of 5 would only return two of them if you ask for the fourth group. Also, the fourth group is group number 3 because it's a zero-indexed array)
Example:
Received Array: {1, 5, 8, 2, 19, -1, 6, 6.5, -14, 20}
Received pageSize: 3
Received pageNumber: 2
If the array was sorted, it would be: {-14, -1, 1, 2, 5, 6, 6.5, 8, 19, 20}
If it was split in groups of size 3: {{-14, -1, 1}, {2, 5, 6}, {6.5, 8, 19}, {20}}
I have to return the third group (pageNumber 2 in a 0-indexed array): {6.5, 8, 19}
The biggest problem is the fact that it needs to be lightning fast. I can't sort the array because it has to be faster than O(n log n).
I've tried several methods, but can never get under O(n log n).
I'm aware that I should be looking for a solution that doesn't fill up all the other groups, and skips a pretty big part of the steps shown in the example above, to create only the requested group before returning it, but I can't figure out a way to do that.
You can find the value of the smallest element s
in the group in linear time using the standard C++ std::nth_element
function (because you know it's index in the sorted array). You can find the largest element S
in the group in the same way. After that, you need a linear pass to find all elements x
such that s <= x <= S
and return them. The total time complexity is O(n)
.
Note: this answer is not C++ specific. You just need an implementation of the k-th order statistics in linear time.