Why am I getting this DCPError?

I'm using CVXPY to optimize a binary portfolio vector so that it beats a benchmark.

import cvxpy as cp
import numpy as np

# Generate a random non-trivial quadratic program.

n = 10 # number of options

np.random.seed(1)
mu = np.random.randn(n) # expected means
var_covar = np.random.randn(n,n) # variance-covariance matrix
var_covar = var_covar.T.dot(var_covar) # cont'd
bench_cov = np.random.randn(n) # n-length vector of cov(benchmark, returns)

lamd = 0.01 # risk tolerance

# Define and solve the CVXPY problem.

x = cp.Variable(n, boolean=True)

prob = cp.Problem(cp.Maximize(mu.T@x + lamd * (cp.quad_form(x, var_covar) - (2 * bench_cov.T@x))), [cp.sum(x) == 4])

prob.solve()

I get this error using CVXPY version 1.1.0a0 (downloaded directly from GitHub):

DCPError: Problem does not follow DCP rules. Specifically:

The objective is not DCP, even though each sub-expression is.

You are trying to maximize a function that is convex.

From what I've read, maximizing a convex function is very difficult, but I got this equation from a paper. I figure I must be doing something wrong, as I'm new to quadratic programming and CVXPY.

Thank you!

Solution

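The problem with your model is that maximizing x'Qx is non-convex when Q is positive semidefinite (var_covar is built as A'A, so it is), and CVXPY's DCP rules reject that. Because the variables x are binary, we can use a trick to linearize the quadratic term.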

Define

y(i,j) = x(i)*x(j)

as an extra binary variable. Then we can write

sum((i,j), x(i)*Q(i,j)*x(j))

as

sum((i,j), y(i,j)*Q(i,j))
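
As a quick numerical sanity check of this substitution, here is a minimal sketch (the small size n = 5 and the random test data are assumptions chosen only for illustration) showing that x'Qx equals the sum of Q(i,j)*y(i,j) when y is the outer product of a 0/1 vector with itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                # small illustrative size (assumption)
A = rng.standard_normal((n, n))
Q = A.T @ A                          # symmetric PSD matrix, built like var_covar
x = rng.integers(0, 2, size=n)       # a random 0/1 vector
y = np.outer(x, x)                   # y[i, j] = x[i] * x[j]

quad = x @ Q @ x                     # x'Qx
lin = np.sum(Q * y)                  # sum over (i,j) of Q[i,j] * y[i,j]
print(np.isclose(quad, lin))
```

The identity holds for any x; the binary restriction only matters for the next step, where the product itself is replaced by linear inequalities.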

The binary multiplication y(i,j) = x(i)*x(j) can be linearized as:

 y(i,j) <= x(i)
 y(i,j) <= x(j)
 y(i,j) >= x(i)+x(j)-1
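
To see why these three inequalities are enough, the following sketch enumerates all four combinations of two binary values and lists which binary y satisfies the constraints. The helper name feasible_y is hypothetical and only used for this illustration:

```python
from itertools import product

def feasible_y(xi, xj):
    """Return the binary values of y allowed by the three inequalities."""
    return [y for y in (0, 1)
            if y <= xi and y <= xj and y >= xi + xj - 1]

for xi, xj in product((0, 1), repeat=2):
    allowed = feasible_y(xi, xj)
    print(xi, xj, allowed)
    # In every case exactly one value survives, and it is the product.
    assert allowed == [xi * xj]
```

In each of the four cases the only feasible y is x(i)*x(j), so the linear constraints reproduce the product exactly for binary inputs.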

With this reformulation, we have a completely linear model. It is a mixed-integer program (MIP) because the variables are binary.

We can do this in CVXPY as:

import numpy as np
import cvxpy as cp

# Generate a random non-trivial quadratic program.

n = 10 # number of options

np.random.seed(1)
mu = np.random.randn(n) # expected means
var_covar = np.random.randn(n,n) # variance-covariance matrix
var_covar = var_covar.T.dot(var_covar) # cont'd
bench_cov = np.random.randn(n) # n-length vector of cov(benchmark, returns)

lamd = 0.01 # risk tolerance

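e = np.ones((1,n)) # row vector of ones; x@e repeats x across n columns

x = cp.Variable((n,1), "x", boolean=True) # x[i] = 1 if option i is selected
y = cp.Variable((n,n), "y", boolean=True) # y[i,j] stands in for x[i]*x[j]

# Objective: the quadratic term x'Qx is replaced by sum(Q[i,j]*y[i,j]).
# Constraints: y[i,j] <= x[i], y[i,j] <= x[j], y[i,j] >= x[i]+x[j]-1, and exactly 4 options selected.
prob = cp.Problem(cp.Maximize(mu.T@x + lamd * (cp.sum(cp.multiply(y,var_covar)) -2*bench_cov.T@x) ),
                  [y <= x@e, y <= (x@e).T, y >= x@e + (x@e).T - e.T@e, cp.sum(x)==4 ])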

prob.solve(solver=cp.ECOS_BB)
print("status",prob.status)
print("obj",prob.value)
print("x",x.value)

This gives the result:

status optimal
obj 4.765120794509871
x [[1.00000000e+00]
 [3.52931931e-10]
 [3.80644178e-10]
 [2.53300872e-10]
 [9.99999999e-01]
 [1.79871537e-10]
 [1.00000000e+00]
 [3.46298454e-10]
 [9.99999999e-01]
 [1.00172269e-09]]
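
Because only 4 of 10 options are selected, there are just 210 candidate portfolios, so a result like this can be cross-checked by brute force. The sketch below is not part of the MIP model; it regenerates the same data with the same seed and evaluates the original quadratic objective for every subset. The helper name objective is hypothetical. If the solver result is optimal, the best value and selected indices printed here should agree with it.

```python
from itertools import combinations
import numpy as np

n = 10
np.random.seed(1)                        # same seed and draw order as the model
mu = np.random.randn(n)
var_covar = np.random.randn(n, n)
var_covar = var_covar.T.dot(var_covar)
bench_cov = np.random.randn(n)
lamd = 0.01

def objective(x):
    """Original (non-linearized) objective for a 0/1 vector x."""
    return mu @ x + lamd * (x @ var_covar @ x - 2 * bench_cov @ x)

best_val, best_idx = -np.inf, None
for idx in combinations(range(n), 4):    # every way to pick 4 of n options
    x = np.zeros(n)
    x[list(idx)] = 1.0
    val = objective(x)
    if val > best_val:
        best_val, best_idx = val, idx

print("best objective:", best_val)
print("selected indices:", best_idx)
```

This kind of enumeration only works for tiny instances, but it is a cheap way to confirm that a solver returned the true optimum.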

Notes:

You are encouraged to use a better MIP solver than ECOS_BB. For this model it gives the correct results, but it is something of a toy solver and is known to struggle on more difficult data sets.
I don't understand the economics of the model. We are maximizing risk here. It may not be prudent to base your investment decisions on the results of this model.
Note that some high-end solvers (like Cplex and Gurobi) do this reformulation automatically. However, CVXPY will not allow you to pass the non-convex model on to the solver.