Search code examples
algorithm

Find the mth largest element


Given an array of n integers, the values in array are 1 to n in any order.

Given another integer m as input.

Now pick the first m items from array (a sub-array of above array) and find the mth largest element from this sub-array.

Next pick the first m+1 items from array and find the mth largest element from this sub-array

so on.. till sub-array holds all n items.

Example:

n = 4
arr = [4,2,1,3]
m = 2

Result:

[2,2,3]

Explanation:

a) m items = [4,2] , mth largest = 2nd largest = 2
b) m+1 items = [4,2,1], 2nd largest = 2
c) m+2 items = [4,2,1,3], 2nd largest = 3

so result is [2,2,3]

This is my code:

static List<Integer> solve(List<Integer> arr, int m) {
   List<Integer> result = new ArrayList<>();
   int n = arr.size();
   for(int i=m; i<=n; i++) {
    List<Integer> sub = arr.subList(0, i);
    Collections.sort(sub, Collections.reverseOrder());
    result.add(sub.get(m-1));
   }
   return result;
}

I want to optimize it so it takes less time to run, what is the correct approach to solve this?


Solution

  • Linear:

    1. create a length n+1 bitvector and set the m bits corresponding to the first m integers. keep track of the smallest integer you've seen so far and set i to this value, and put i in your answer array.

    2. To process an integer k:

      • 2a) If k < i, ignore it.
      • 2b) if k > i, set the k'th bit of your bitvector, and increment i until it reaches another set bit.
      • 2c) append i to your answer.

    This is linear because we increment at most n times across the algorithm.

    Example:

    n = 4
    arr = [4,2,1,3]
    m = 2
    
    step, bitvector, answer, notes
    0     10100      [2]     The final 0 in 10100 is the 0th index, which we ignore.
    1     10100      [2, 2]     1 is smaller than 2, so it doesn't affect our answer.
    2     11100      [2, 2, 3]  3 is bigger than 2, so we increment i until we hit another set bit, in this case 3.
    

    Ruby Code: This function takes in n and m, generates a shuffled array of 1-n, and then uses my approach to get the output you're looking for in linear time.

    
        def f(n, m)
          # We'll use two indices for this:
          #   i: we use i to parse the whole array
          #   j: we use j for the mth largest element
          
          # create an array by shuffling 1-n
          arr = 1.upto(n).to_a.shuffle!
          # initialize variables
          ans = []
          bitvec = 0
          j = arr[0]
          
          # parse the first m elts of arr
          0.upto(m-1) do |i|
            elt = arr[i]
            bitvec |= 1 << elt  # set the bit that matches the elt
            if elt < j
              j = elt
            end
          end
          ans.append(j)
          
          # parse the rest of the array
          m.upto(n-1) do |i|
            elt = arr[i]
            if elt > j
              # set the corresponding bit
              bitvec |= 1 << elt
              
              # increment j until we encounter the next set bit
              loop do
                # increment j
                j += 1
        
                # if the new j'th bit is set
                break if bitvec & 1 << j > 0
              end 
            end
            ans.append(j)
          end
          
          return "input arr = #{arr.to_s}; ans = #{ans.to_s}"
        end
    
    

    Here's some sample output:

    > f(10,4)
    => "input arr = [3, 2, 8, 5, 1, 10, 7, 9, 6, 4]; ans = [2, 2, 3, 5, 7, 7, 7]"
    > f(10,4)
    => "input arr = [3, 8, 4, 10, 1, 2, 7, 9, 6, 5]; ans = [3, 3, 3, 4, 7, 7, 7]"
    > f(10,4)
    => "input arr = [8, 2, 7, 9, 3, 10, 1, 6, 4, 5]; ans = [2, 3, 7, 7, 7, 7, 7]"
    > f(10,4)
    => "input arr = [8, 4, 6, 3, 1, 9, 10, 2, 5, 7]; ans = [3, 3, 4, 6, 6, 6, 7]"
    > f(10,4)
    => "input arr = [3, 9, 2, 7, 10, 4, 8, 1, 5, 6]; ans = [2, 3, 4, 7, 7, 7, 7]"
    

    My answer is linear time and linear space, or more specifically, O(n) time and O(n) space.

    We can get it down to O(n) time and O(m) space by using a set instead of a bitvector, and after initialization, only keeping the m largest elements in the set. Here's code to do that (in Ruby):

    
        def g(n, m, arr = nil)
          # We'll use two indices for this:
          #   i: we use i to parse the whole array
          #   j: we use j for the mth largest element
          
          # create an array by shuffling 1-n
          arr = 1.upto(n).to_a.shuffle! if arr.nil?
          # initialize variables
          ans = []
          s = Set.new()
          j = arr[0]
          
          # parse the first m elts of arr
          0.upto(m-1) do |i|
            elt = arr[i]
            s.add(elt)
            if elt < j
              j = elt
            end
          end
          ans.append(j)
          
          # parse the rest of the array
          m.upto(n-1) do |i|
            elt = arr[i]
            if elt > j
              # set the corresponding bit
              s.add(elt)
              s.delete(j)
              
              # increment j until we encounter the next set bit
              loop do
                # increment j
                j += 1
        
                # if the new j'th bit is set
                break if s.include?(j)
              end 
            end
            ans.append(j)
          end
          
          puts "input arr = #{arr.to_s}; ans = #{ans.to_s}"
        end
        
        def h(n, m)
          arr = 1.upto(n).to_a.shuffle!
          f(n, m, arr)
          g(n, m, arr)
        end