I'm trying to find an answer to an interview question I got recently but wasn't able to solve. I was asked to sort 10 million objects (each contains a 10-bit integer and a unique string value) by the 10-bit integer in place using O(1) space and O(n) time. For all intensive purposes, the string value just makes the problem more difficult, but you don't sort the objects by it.
I was thinking of using counting sort however, that would only work if I was just sorting the 10-bit integers. Sorting the objects makes things a bit more complicated, since I need to keep track of their unique string values. I looked at radix sort, but it appears to use O(n) as worst-case.
Is there some type of sort that I'm missing?
There's an in-place (MSB) radix sort.
It goes something like this:
Move all elements with that bit equal to 0 to the left
and all those with that bit equal to 1 to the right.
You do this similar to quicksort by moving from both sides to the middle, swapping 1's on the left with 0's on the right.
The process goes something like this:
↓ ↓
0... 1...
--------------- ---------------
↓ ↓ ↓ ↓
00... 01... 10... 11...
------- ------- ------- -------
↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
000 001 010 011 101 110 110 111
--- --- --- --- --- --- --- ---
...
The time complexity is O(bits*n)
and the space complexity is O(bits)
.
So with a constant number of bits, the time complexity is O(n)
and the space complexity is O(1)
.
It's also possible to do this using counting sort (with the same time and space complexity).
Create a lookup table (mapping index to offset, can be an array) containing all of the above starting points, which will be used to determine where the next element should go.
Now you loop over the array and swap each element that's out of place to the first spot it can go to, then do the same with each element that we swapped, until the current element is where it belongs.
Increase the relevant element's offset in the lookup table with each step.
Note that it's impossible to swap the same element twice, thus this will be linear time.
That all sounds very confusing, so here's an example:
4 5 3 1 2 3 4 4 1 5 4 3 2 3
// counts:
1: 2, 2: 2, 3: 4, 4: 4, 5: 2
// the starting points (offsets) based on the counts:
1 at position 0
2 at position (offset of 1)+(count of 1) = 0+2 = 2
3 at position (offset of 2)+(count of 2) = 2+2 = 4
4 at position (offset of 3)+(count of 3) = 4+4 = 8
5 at position (offset of 4)+(count of 4) = 8+4 = 12
// sorting:
4 5 3 1 2 3 4 4 1 5 4 3 2 3 // out of place, swap with offsets[4] = 8 (offsets[4]++)
^
1 5 3 1 2 3 4 4 4 5 4 3 2 3 // in correct position, moving on (offsets[1]++)
^
1 5 3 1 2 3 4 4 4 5 4 3 2 3 // swap with offsets[5] = 12 (offsets[5]++)
^
1 2 3 1 2 3 4 4 4 5 4 3 5 3 // swap with offsets[2] = 2 (offsets[2]++)
^
1 3 2 1 2 3 4 4 4 5 4 3 5 3 // swap with offsets[3] = 4 (offsets[3]++)
^
1 2 2 1 3 3 4 4 4 5 4 3 5 3 // swap with offsets[2] = 3 (offsets[2]++)
^
1 1 2 2 3 3 4 4 4 5 4 3 5 3 // in correct position, moving on (offsets[1]++)
^
1 1 2 2 3 3 4 4 4 5 4 3 5 3 // in correct position, moving on (offsets[2]++)
^
You get the idea...
Note that the size of the count table and the lookup table are O(max value)
, which is O(1)
if each number contains a fixed number of bits.
And you do a constant amount of work for each element, so that's O(n)
time.