Search code examples
c++algorithmoptimizationgreedy

Finding an increasing sequence a[] which minimizes sigma(abs(a[i]+c[i]))


Problem statement

c is a given array of n integers; the problem is to find an increasing array of n integers a (a[i] <= a[i+1]) such that this sum is minimized:

abs(a[0]+c[0]) + abs(a[1]+c[1]) + ... + abs(a[n-1]+c[n-1])
// abs(x) = absolute value of x  

An optimal a exists only made by integers appeared in c so we can solve it using DP in O(n^2):

dp[i][j]: a[i] >= j'th integer  

But there should be a faster solution, probably O(n lg n).


Solution

  • Update: I add the solution, which minimizes sum-of-absolute-values. Other solution, which minimizes sum-of-squares, is still here, at the end of this post, in case someone is interested.

    Minimize sum-of-absolute-values algorithm

    I start with the algorithm, that works only with the array of non-negative integers. Then it will be extended to any integers (or even to non-integer objects).

    This is a greedy algorithm. It uses bitwise representation of integers. Start with the most significant bit of each array's element (and ignore other bits for a while). Find largest prefix, that maximizes ones/zeros balance. Now clear all the array values, belonging to prefix and having zero most significant bit (zero all bits of these values). And for all the array values in the suffix, that have non-zero most significant bit, set all other bits to "one". Apply this algorithm recursively to both prefix and suffix using next bit as "most significant".

    This splits the original array into segments. You can find median of each segment and fill the output array with this median. Alternatively, just set corresponding bits in the output array when processing prefixes and leave them zero when dealing with suffixes.

    All this works because minimizing sum-of-absolute-values requires to find the median of subarrays, and while finding this median, you can compare values very approximately, always using only a single most-significant bit for the whole array and descending to other bits later, for subarrays.

    Here is C++11 code snippet, which explains the details:

    //g++ -std=c++0x
    #include <iostream>
    #include <vector>
    #include <iomanip>
    
    using namespace std;
    typedef vector<unsigned> arr_t;
    typedef arr_t::iterator arr_it;
    
    void nonincreasing(arr_it array, arr_it arrayEnd, arr_it out, int bits)
    {
      if (bits != -1)
      {
        int balance = 0;
        int largestBalance = -1;
        arr_it prefixEnd = array;
    
        for (arr_it i = array; i != arrayEnd; ++i)
        {
          int d = ((*i >> bits) & 1)? 1: -1;
          balance += d;
          if (balance > largestBalance)
          {
            balance = largestBalance;
            prefixEnd = i + 1;
          }
        }
    
        for (arr_it i = array; i != prefixEnd; ++i)
        {
          *(out + (i - array)) += (1 << bits);
          if (!((*i >> bits) & 1))
          {
            *i = 0;
          }
        }
        nonincreasing(array, prefixEnd, out, bits - 1);
    
        for (arr_it i = prefixEnd; i != arrayEnd; ++i)
        {
          if ((*i >> bits) & 1)
          {
            *i = (1 << bits) - 1;
          }
        }
        nonincreasing(prefixEnd, arrayEnd, out + (prefixEnd - array), bits - 1);
      }
    }
    
    void printArray(const arr_t& array)
    {
      for (auto val: array)
        cout << setw(2) << val << ' ';
      cout << endl;
    }
    
    int main()
    {
      arr_t array({12,10,10,17,6,3,9});
      arr_t out(array.size());
      printArray(array);
    
      nonincreasing(begin(array), end(array), begin(out), 5);
      printArray(out);
    
      return 0;
    }
    

    To work with any integers, not just positive, there are two alternatives:

    1. Find minimum integer in the input array and subtract it from other elements. When done with the main algorithm, add it back (and negate the result). This gives complexity O(N log U), where U is range of the array's values.
    2. Compact values of the input array. Sort it by value, remove duplicates, and instead of the original values, use index of this array. When done with the main algorithm, change indexes back to corresponding values (and negate the result). This gives complexity O(N log H), where H is the number of unique input array's values. Also this allows using not only integers, but any objects which may be ordered (compared to each other).

    Minimize sum-of-squares algorithm

    Here is a high level description of this algorithm. Complexity is O(N).

    Start with searching of a subarray, starting at the beginning of c[] and having largest possible average value. Then fill subarray of the same length in a[] with this average value (rounded to nearest integer and negated). Then remove this subarray from a[] and c[] (in other words, assume the beginning of a[] and c[] is moved forward by subarray's length) and recursively apply this algorithm to the remaining parts of a[] and c[].

    Most interesting part of this algorithm is searching of largest subarray. Fill a temporary array b[] with cumulative sum of elements from c[]: b[0] = c[0], b[1] = b[0] + c[1], ... Now you can determine average of any interval in c[] with this: (b[i+m] - b[i]) / m. By coincidence, exactly the same formula (maximization of its value) determines a tangent line from b[i] to the curve, described by b[]. So you can find all maximum values (as well as subarray bounds), needed for this algorithm, at once, using any Convex hull algorithm. Convex hull algorithms usually work with points in two dimensions and have super-linear complexity. But in this case, points are already sorted in one dimension, so Graham scan or Monotone Chain algorithm do the task in O(N) time, which also determines complexity of the whole algorithm.


    Pseudocode for this algorithm:

    1. b[] = Integrate(c[])
    2. h[] = ConvexHull(b[])
    3. a[] = - Derivative(h[])

    Visualization of the example array processing:

    example