Search code examples
pythonoptimizationcvxpyconvex-optimizationquadratic-programming

CVXPY claims matrix is not symmetric and PSD despite my matrix being symmetric and PSD


I have a quadratic program (see below) that causes an error due to my quad_form matrix being not symmetric or PSD. Why is this happening? (Note that the issue is with my matrix P below so feel free to ignore the first block of code in the function.)

def run_QP(n, p, L, Y, X, sigma, B_prop):
  
  # Configuring our inputs to suit LP
  Y_LP = Y.flatten('F')
  X_LP = np.zeros((n*L,p*L))
  for l in range(L):
    X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
  C_LP = np.eye(p)
  for l in range(L-1):
    C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
  one_p = np.ones(p)

  # Setting the objective matrix and vector
  constant = 1/(p*cp.power(sigma,2))
  P = constant * (X_LP.T @ X_LP)
  I_pL = np.eye(p*L)
  temp_term = -1*cp.log(B_prop[0]) * (one_p.T @ I_pL[0:p,:])
  for l in range(1,L):
    I_pL_trun = I_pL[p*l:p*(l+1),:]
    temp_term -= cp.log(B_prop[l]) * (one_p.T @ I_pL_trun)
  q = -1*constant*(Y_LP.T@X_LP) + temp_term
  
  # Setting the equality constraints matrix
  A = np.concatenate((X_LP, C_LP), axis=0)
  # Setting the equality constraints vector
  b = np.concatenate((Y_LP, one_p), axis=0)

  # Define and solve the CVXPY problem.
  x = cp.Variable(p*L)
  prob = cp.Problem(cp.Minimize((1/2)*cp.quad_form(x, P) + q.T @ x),
                    [A @ x == b,])
  prob.solve()

  # Reconfiguring our outputs to suit pooled data
  B_QP_est = prob.value
  B_est = np.zeros((p,L))
  for l in range(L):
    B_col = B_QP_est[p*l:p*(l+1)]
    B_est[:,l] = B_col
  
  return B_est

B_bar_prob = [0.5,0.5]
L = 2
alpha = 0.5
p = 100
n = 20
sigma = 0.1

indices = np.random.choice(np.array([0, 1]), size=p, p=B_bar_prob)
B = np.eye(np.max(indices)+1)[indices]
X = binomial(1, alpha, (n, p))
Psi = normal(0, sigma, (n, L))
Y = np.dot(X, B) + Psi
B_prop = np.sum(B, axis=0) / n

B_hat = run_QP(n, p, L, Y, X, sigma, B_prop)

When I check for PSD using the following code (after swapping my cvxpy atomic functions out for numpy functions, I get true):

def isPSD(A, tol=1e-8):
  E = np.linalg.eigvalsh(A)
  return np.all(E > -tol)

Hence, I am very confused as to what is going on...

Edit: Thanks to the answer below from Michal Adamaszek, I have modified my program to work.

def run_QP(n, p, L, Y, X, sigma, B_prop):
  
  # Configuring our inputs to suit LP
  Y_LP = Y.flatten('F')
  X_LP = np.zeros((n*L,p*L))
  for l in range(L):
    X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
  C_LP = np.eye(p)
  for l in range(L-1):
    C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
  one_p = np.ones(p)

  # Setting the objective matrix and vector
  constant = 1/(p*(sigma**2))
  temp_term = np.zeros(p*L)
  for l in range(L):
    I_pL = np.eye(p*L)
    I_pL_trun = I_pL[p*l:p*(l+1),:]
    temp_term -= np.log(B_prop[l]) * np.dot(one_p.T,I_pL_trun)
  q = -1*constant*(Y_LP.T @ X_LP) + temp_term
  
  # Setting the equality constraints matrix
  A = np.concatenate((X_LP, C_LP), axis=0)
  # Setting the equality constraints vector
  b = np.concatenate((Y_LP, one_p), axis=0)

  # Define and solve the CVXPY problem.
  x = cp.Variable(p*L)
  objective = cp.Minimize((1/2)*constant*cp.sum_squares(X_LP @ x) + (q.T @ x))
  constraints = []
  constraints.append(A @ x == b)
  problem = cp.Problem(objective, constraints)
  problem.solve()
  print('optimal obj value:', problem.value)

  # Reconfiguring our outputs to suit pooled data
  B_QP_est = x.value
  B_est = np.zeros((p,L))
  for l in range(L):
    B_col = B_QP_est[p*l:p*(l+1)]
    B_est[:,l] = B_col
  
  return B_est

However, this always gives inf for the objective function, even when the conic part is zero. See the following output:

===============================================================================
                                     CVXPY                                     
                                     v1.3.1                                    
===============================================================================
(CVXPY) Jun 05 07:13:28 AM: Your problem has 200 variables, 1 constraints, and 0 parameters.
(CVXPY) Jun 05 07:13:28 AM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Jun 05 07:13:28 AM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Jun 05 07:13:28 AM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
                                  Compilation                                  
-------------------------------------------------------------------------------
(CVXPY) Jun 05 07:13:28 AM: Compiling problem (target solver=OSQP).
(CVXPY) Jun 05 07:13:28 AM: Reduction chain: CvxAttr2Constr -> Qp2SymbolicQp -> QpMatrixStuffing -> OSQP
(CVXPY) Jun 05 07:13:28 AM: Applying reduction CvxAttr2Constr
(CVXPY) Jun 05 07:13:28 AM: Applying reduction Qp2SymbolicQp
(CVXPY) Jun 05 07:13:28 AM: Applying reduction QpMatrixStuffing
(CVXPY) Jun 05 07:13:28 AM: Applying reduction OSQP
(CVXPY) Jun 05 07:13:28 AM: Finished problem compilation (took 8.957e-02 seconds).
-------------------------------------------------------------------------------
                                Numerical solver                               
-------------------------------------------------------------------------------
(CVXPY) Jun 05 07:13:28 AM: Invoking solver OSQP  to obtain a solution.
-----------------------------------------------------------------
           OSQP v0.6.2  -  Operator Splitting QP Solver
              (c) Bartolomeo Stellato,  Goran Banjac
        University of Oxford  -  Stanford University 2021
-----------------------------------------------------------------
problem:  variables n = 240, constraints m = 180
          nnz(P) + nnz(A) = 4156
settings: linear system solver = qdldl,
          eps_abs = 1.0e-05, eps_rel = 1.0e-05,
          eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04,
          rho = 1.00e-01 (adaptive),
          sigma = 1.00e-06, alpha = 1.60, max_iter = 10000
          check_termination: on (interval 25),
          scaling: on, scaled_termination: off
          warm start: on, polish: on, time_limit: off

iter   objective    pri res    dua res    rho        time
   1  -9.5106e-01   3.09e+01   1.59e+07   1.00e-01   6.26e-03s
  50   1.0000e+30   2.49e-02   2.14e-04   1.00e-01   8.22e-03s

status:               primal infeasible
number of iterations: 50
run time:             8.35e-03s
optimal rho estimate: 3.87e+00

So when the conic part is zero, a simple linear program below gets me the optimal solution:

from scipy.optimize import linprog

def run_LP(n, p, L, Y, X, B_prop):
  
  # Configuring our inputs to suit LP
  Y_LP = Y.flatten('F')
  X_LP = np.zeros((n*L,p*L))
  for l in range(L):
    X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
  C_LP = np.eye(p)
  for l in range(L-1):
    C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
  one_p = np.ones(p)

  # Setting the objective vector
  c = np.zeros(p*L)
  for l in range(L):
    I_pL = np.eye(p*L)
    I_pL_trun = I_pL[p*l:p*(l+1),:]
    c -= np.log(B_prop[l]) * np.dot(one_p.T,I_pL_trun)
  
  # Setting the equality constraints matrix
  A = np.concatenate((X_LP, C_LP), axis=0)
  # Setting the equality constraints vector
  b = np.concatenate((Y_LP, one_p), axis=0)

  # Solving linear programming problem
  res = linprog(c, A_eq=A, b_eq=b)
  print("Linear program:", res.message)

  # Reconfiguring our outputs to suit pooled data
  B_LP_est = res.x
  B_est = np.zeros((p,L))
  for l in range(L):
    B_col = B_LP_est[p*l:p*(l+1)]
    B_est[:,l] = B_col
  
  return B_est

Why is this happening?


Solution

  • Internally CVXPY does something else to factorize the matrix, so if a check with eigvalsh reveals you are just borderline good, then the internal method used in CVXPY can tip the balance the other way.

    More importantly, there is no need to formulate your problem as a QP if you already have it on conic form to start with. This is a standard SOCP reformulation. Namely,

    (1/2)*cp.quad_form(x, P)
    

    is in your case equivalent to

    (1/2) * constant * cp.sum_squares(X_LP @ x)
    

    because

    x.T@P@x = constant * norm(X_LP@x)^2
    

    This way you avoid the roundabout way where you first construct P=X_LP.T@X_LP and then CVXPY has to work to factorize it back again. You also get a problem convex by construction, so no numerical issues will come in the way of the convexity check. More about this in https://docs.mosek.com/modeling-cookbook/cqo.html#conic-quadratic-modeling