Search code examples
cvxpy

Can I use cvxpy to split integer-2D-array to two arrays?


I have a problem that I wonder if I can solve using cvxpy:

The problem: I have a two dimensional integers array and I want to split it to two array in a way that each row of the source array is either in the 1st or 2nd array.

The requirement from these arrays us that for each column, the sum of integers in array #1 will be as close as possible to twice the sum of integers in array #2.

Example: Consider the input array:

[
  [1, 2, 3, 4],
  [4, 6, 2, 5],
  [3, 9, 1, 2],
  [8, 1, 0, 9],
  [8, 4, 0, 5],
  [9, 8, 0, 4]
]

The sums of its columns is [33, 30, 6, 29] so ideally we are looking for 2 arrays that the sums of their columns will be:

  • Array #1: [22, 20, 4, 19]
  • Array #2: [11, 10, 2, 10]

Off course this is not always possible but I looking for the best solution for this problem.

A possible solution for this specific example might be:

  • Array #1:
[
  [1, 2, 3, 4],
  [4, 6, 2, 5],
  [8, 4, 0, 5],
  [9, 8, 0, 4]
]

With column sums: [22, 20, 5, 18]

  • Array #2:
[
  [3, 9, 1, 2],
  [8, 1, 0, 9],
]

With column sums: [11, 10, 1, 11]

Any suggestions?


Solution

  • You can use a boolean vector variable to select rows. The only thing left to decide is how much to penalize errors. In this case I just used the norm of the difference vector.

    import cvxpy as cp
    import numpy as np
    data = np.array([
      [1, 2, 3, 4],
      [4, 6, 2, 5],
      [3, 9, 1, 2],
      [8, 1, 0, 9],
      [8, 4, 0, 5],
      [9, 8, 0, 4]
    ])
    x = cp.Variable(data.shape[0], boolean=True)
    prob = cp.Problem(cp.Minimize(cp.norm((x - 2 * (1 - x)) * data)))
    prob.solve()
    A = np.round(x.value) @ data
    B = np.round(1 - x.value) @ data
    

    A and B are the sum of rows.

    (array([21., 20.,  4., 19.]), array([12., 10.,  2., 10.]))