Search code examples
calgorithmfrequency

Is there an efficient way to find the most frequent number in a sorted array of ranges in constant space


I have to find the most frequent number (or mode) from a sorted array of ranges. This range consists simply of a start and end number that is inclusive. As an example arr[0] could contain the numbers {0, 3}, which would then mean the numbers {0, 1, 2, 3}. The ranges are sorted by start number, and when start numbers are equal they are sorted by the end number. Both in ascending order.

The array is filled with n of these ranges. I need to find a way to find the number that occurs in the most ranges.

I have thought of a solution that goes through all the numbers once, but this is in O(n * m), where m is the average length of the ranges. This is an horrendously slow algorithm for the size I am dealing with.

I have also looked at some faster solutions such as this, but I can't think of an efficient way of implementing it when you cannot easily find the kth number, because the ranges can be wildly different sizes.

I'm currently trying to implement this in C, but any idea's or links to resources dealing with this are greatly appreciated.


Solution

    1. Convert each ranges {a, b} to two events "add 1 at a" and "subtract 1 at b+1".
    2. Sort the events by ascending order of where to add or subtract. subtract events should be come earlier than add events for same position.
    3. Execute the events and check at what point the result of calculation became highest.

    For example, if we have 3 ranges {0, 5}, {2, 9}, {4, 7}, each range will be like this:

    0123456789
    ------
      --------
        ----
    

    And sorted events are:

    • add 1 to 0
    • add 1 to 2
    • add 1 to 4
    • subtract 1 to 6
    • subtract 1 to 8
    • subtract 1 to 10

    Executing these events, the value becomes the highest 3 after executing the 3rd event and before executing the 4th event. Therefore we can say that the most appering numbers are 4 and 5.

    The time complexity is O(N log N) because:

    • O(N) conversion from ranges to events
    • O(N log N) sorting of events
    • O(N) execution of events

    The space complexity is O(N) for storing the events.