Tags: python, optimization, terminal, solver, cvxpy

Automatic unwanted terminal outputs while solving cvxpy optimization problem


While trying to solve a logistic regression problem using cvxpy, I got a bunch of terminal output when calling the solve() function, even though no print statements were programmed. Furthermore, no information about the problem was printed to the terminal even though verbose was set to True, and the optimal value could not be accessed.

I guess I'm doing something wrong in the problem formulation but can't quite figure out what it is.

The problem was defined as follows in the minimal code example:

import numpy as np
import cvxpy as cp

# Random 0/1 data standing in for the labels and the selection matrix
y_vec = np.random.choice([0, 1], size=(728,), p=[9./10, 1./10])
M_mat = np.random.choice([0, 1], size=(728, 801), p=[9./10, 1./10])
beta = cp.Variable(M_mat.shape[0])

# Log-likelihood built term by term over the first 400 columns
objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - \
        cp.log(1 + cp.exp(M_mat[:, i].T @ beta))

prob = cp.Problem(cp.Maximize(objective))
prob.solve(verbose=True)
print("Optimal var reached", beta.value)

Both y_vec and M_mat are numpy arrays with data type int64. Both are selection matrices for the classification problem, containing only 0s and 1s. For the purpose of the minimal code example they are randomly generated to reproduce the error. Furthermore, M_mat[:, i].T @ beta was checked to result in a scalar, as intended.

When I execute the code, I get a lot of printouts like these, with the program terminating after a certain number of them.

[Screenshot: end of the terminal printout, after which the program terminates]

Shown here is only the end of the printouts, where the program terminates. But there are many blocks of the form log(1.0 + exp([ 0. 0. ... 0.] * var0)), where the printed vector has the same length as the variable beta.

I find this result quite confusing. How can I arrive at a single vector for the optimization variable beta? Any help is much appreciated!


Solution

  • After some trial and error I found that using the cvxpy.logistic() function results in a successful computation of the solution with the desired output vector.

    This was achieved by reformulating the objective function as follows:

    objective = 0
    for i in range(400):
        objective += y_vec[i] * M_mat[:, i].T @ beta - cp.logistic(M_mat[:, i].T @ beta)
    

    Even though both implementations should be mathematically identical according to Atomic Functions - CVXPY, they behave drastically differently. The likely explanation is CVXPY's disciplined convex programming (DCP) ruleset: the hand-written composition cp.log(1 + cp.exp(x)) applies a concave function to a convex expression, so its curvature is unknown and CVXPY cannot verify the problem, and the long expression dumps in the terminal presumably come from the resulting error message printing the offending expression. cp.logistic, by contrast, is a single atom whose convexity CVXPY knows, so the problem can be verified as concave and handed to a solver. I hope the solution is nonetheless useful for somebody.