Search code examples
algorithmbinary-searchgreedy

Maximize the minimum subarray sum out of all subarrays when an array is divided into K subarrays/groups


My doubt is more related to how the above question is framed.

The algorithm used to approach the above problem is Binary Search.

 int low = 0, high = sum(A);
    while(low <= high) {
         int mid = low + (high-low)/2;
         if(numOfSubArrays(A, mid) <= K)
             high = mid-1;
         else
             low = mid+1;
     }
ans = low;

int numOfSubArrays(vector<int>& A, int sum) {
    int total = 0, count = 0;
    for(int num : A) {
       total += num;
       if(total >= sum) {
           total = 0;
           count++;
       }
    }
    return count;
}

I understand the flow of the algorithm in the above context.

If we can divide A into M parts and M < K, it means we need to lower the mid.

If M > K, we need to increase the mid value.

When M = K, we need to lower the mid value to find the minimum sum among all the subarrays.

However, to me it seems like there’ll be only one way to divide A into K parts and the minimum sum for a subarray for that would be the answer.

So how exactly does the term “maximising the minimum fit into the picture” ? What ensures that the minimum sum of the subarray is maximised?


Solution

  • First off, let's start with an example that would test your intuition as to whether there is a single solution for a given A and K.

    For A=[1,2,3,4,5] and K=2, we could have the partitions P and minimum subarray sum v as follows:

    P = [1], [2,3,4,5]  ; v = 1
    P = [1, 2], [3,4,5] ; v = 3
    P = [1,2,3], [4,5]  ; v = 6
    P = [1,2,3,4], [5]  ; v = 5
    

    As you can see, there are multiple ways to partition A into K subarrays, and each may have a different minimum subarray sum value.

    That said, we can now analyze what is going on in this solution. This algorithm presents a common approach. Basically, you have a space of acceptable answers, and you search through that space for a best answer using binary search.

    In this particular problem, the least possible answer is actually the smallest value in the array, but 0 can be used without any issues. The maximum possible answer would be the sum of all the values in the array. (i.e., if K==1) So you go through those possible answers and see if you can partition the array into K subarrays, with the smallest subarray sum being mid, which is the current potential answer you are checking. The space of possible answers is suitable for this binary search strategy, because up to a certain value v, the number of subarrays with sums that are greater than or equal to mid is less than K. And starting with the value v, you can actually have >= K subarrays with sums greater than or equal to mid.

    At each step (i.e., iteration of binary search) you try and form groups so that sum of each group is >= mid. As soon as a group's sum exceed that threshold, you start a new group. So, in a sense, you are correct, because for a particular value of mid, there is only one optimal way to partition the array: make sure each subarray has a sum >= K and as soon as this requirement is satisfied, start partitioning a new subarray from the original array. This is a locally optimal strategy that ensures globally optimal result. (i.e., greedy) But what actually ensures maximizing the minimum subarray sum is the binary search. By iterating over the space of possible answers, and picking the least acceptable value once it is done, binary search ensures that, among the answers that satisfy the given constraints, the smallest one (i.e., v) is picked.

    If you are still having a hard time digesting the strategy of this algorithm, try to consider it from a different perspective. It isn't the partitioning you're looking for, but the least value v for which you can create K continuous groups, sum of each would be greater than or equal to v. To ensure you have as many acceptable subarrays for a given mid, you keep adding new elements to a subarray as long as the requirements (i.e., sum >= mid) are not met, and start forming a new subarray as soon as a subarray reaches that threshold. So, for each mid, numOfSubArrays returns you the maximum number of acceptable groups you can form. And overall, binary search just finds out the value v, the least value for which numOfSubArrays returns something greater than or equal to K.