While trying to solve a logistic regression problem with cvxpy, I got a bunch of terminal output when calling the solve() function, even though no print statements were programmed. Furthermore, no information about the problem was printed to the terminal even though verbose was set to True, and the optimal value could not be accessed.
I guess I'm doing something wrong in the problem formulation but can't quite figure out what it is.
The problem was defined as follows in the minimal code example:
import numpy as np
import cvxpy as cp
y_vec = np.random.choice([0, 1], size=(728,), p=[9./10, 1./10])
M_mat = np.random.choice([0, 1], size=(728,801), p=[9./10, 1./10])
beta = cp.Variable(M_mat.shape[0])
objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - \
        cp.log(1 + cp.exp(M_mat[:, i].T @ beta))
prob = cp.Problem(cp.Maximize(objective))
prob.solve(verbose=True)
print("Optimal var reached", beta.value)
Both y_vec and M_mat are numpy arrays with data type int64. Both are selection matrices for the classification problem, consisting only of 0s and 1s. For the purpose of the minimal code example they are randomly generated to reproduce the error. Furthermore, M_mat[:, i].T @ beta was checked to result in a scalar as intended.
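For reference, the scalar nature of such a term can be verified directly on the cvxpy expression. A minimal sketch of such a check (not the exact code I used; the dimensions just mirror the example above):

import numpy as np
import cvxpy as cp

M_mat = np.random.choice([0, 1], size=(728, 801), p=[9./10, 1./10])
beta = cp.Variable(M_mat.shape[0])

# A single term of the objective: inner product of one column with beta
term = M_mat[:, 0].T @ beta
print(term.shape)        # () -- the expression has scalar shape
print(term.is_scalar())  # True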
When I execute the code I get a lot of printouts like these, with the program terminating after a certain number of them.

Shown here is only the end of the printouts before the program terminates, but there are many blocks of the form log(1.0 + exp([ 0. 0. ...... 0.] * var0)), where the printed vector is of the same length as the variable beta.

I find this result quite confusing. How can I arrive at a single vector for the optimization variable beta? Any help is much appreciated!
After some trial and error I found out that using the cvxpy.logistic() function somehow results in a successful computation of the solution with the desired output vector.
This was achieved by reformulating the objective function as follows:
objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - cp.logistic(M_mat[:, i].T @ beta)
Even though both implementations should be mathematically equivalent according to Atomic Functions - CVXPY, they result in drastically different outputs. Why this is the case I don't know. I hope the solution might nonetheless be useful for somebody, and I'm curious to learn why the behavior is so different if someone knows more.
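One detail that may be related (a sketch of my understanding, not a definitive explanation): cvxpy lets you ask an expression whether it complies with the DCP ruleset, and the two formulations answer differently, since log applied to the convex expression 1 + exp(x) is not a composition the DCP rules accept, while the logistic atom is known to be convex:

import cvxpy as cp

x = cp.Variable()

# Hand-built log(1 + exp(x)): 1 + exp(x) is convex, but log of a
# convex expression is not a composition the DCP rules recognize.
expr_manual = cp.log(1 + cp.exp(x))
print(expr_manual.is_dcp())   # False

# Dedicated atom for log(1 + exp(x)), known to be convex.
expr_atom = cp.logistic(x)
print(expr_atom.is_dcp())     # True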