While trying to solve a logistic regression problem with cvxpy, I got a bunch of terminal output when calling the solve() function, even though no print statements were programmed. Furthermore, no information about the problem was printed to the terminal even though verbose was set to True, and the optimal value could not be accessed.
I guess I'm doing something wrong in the problem formulation but can't quite figure out what it is.
The problem was defined as follows in the minimal code example:
import numpy as np
import cvxpy as cp
y_vec = np.random.choice([0, 1], size=(728,), p=[9./10, 1./10])
M_mat = np.random.choice([0, 1], size=(728,801), p=[9./10, 1./10])
beta = cp.Variable(M_mat.shape[0])
objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - \
        cp.log(1 + cp.exp(M_mat[:, i].T @ beta))
prob = cp.Problem(cp.Maximize(objective))
prob.solve(verbose=True)
print("Optimal var reached", beta.value)
Both y_vec and M_mat are numpy arrays with data type int64. Both are selection matrices for the classification problem, consisting only of 0s and 1s. For the purpose of the minimal code example they are randomly generated to reproduce the error. Furthermore, M_mat[:, i].T @ beta was checked to result in a scalar as intended.
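For reference, the scalar nature of such a term can be verified directly on the cvxpy expression. A minimal sketch of such a check (not the exact code I used; the dimensions just mirror the example above):

import numpy as np
import cvxpy as cp

M_mat = np.random.choice([0, 1], size=(728, 801), p=[9./10, 1./10])
beta = cp.Variable(M_mat.shape[0])

# A single term of the objective: inner product of one column with beta
term = M_mat[:, 0].T @ beta
print(term.shape)        # () -- the expression has scalar shape
print(term.is_scalar())  # True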
When I execute the code I get a lot of printouts like these, with the program terminating after a certain number of them.

Shown here is only the end of the printouts before the program terminates, but there are many blocks of the form log(1.0 + exp([ 0. 0. ...... 0.] * var0)), where the printed vector is of the same length as the variable beta.

I find this result quite confusing. How can I arrive at a single vector for the optimization variable beta? Any help is much appreciated!
After some trial and error I found out that using the cvxpy.logistic() function somehow results in a successful computation of the solution with the desired output vector.
This was achieved by reformulating the objective function as follows:
objective = 0
for i in range(400):
    objective += y_vec[i] * M_mat[:, i].T @ beta - cp.logistic(M_mat[:, i].T @ beta)
Even though both implementations should be mathematically equivalent according to Atomic Functions - CVXPY, they result in drastically different outputs. Why this is the case I don't know. I hope the solution might nonetheless be useful for somebody, and I'm curious to learn why the behavior is so different if someone knows more.
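One detail that may be related (a sketch of my understanding, not a definitive explanation): cvxpy lets you ask an expression whether it complies with the DCP ruleset, and the two formulations answer differently, since log applied to the convex expression 1 + exp(x) is not a composition the DCP rules accept, while the logistic atom is known to be convex:

import cvxpy as cp

x = cp.Variable()

# Hand-built log(1 + exp(x)): 1 + exp(x) is convex, but log of a
# convex expression is not a composition the DCP rules recognize.
expr_manual = cp.log(1 + cp.exp(x))
print(expr_manual.is_dcp())   # False

# Dedicated atom for log(1 + exp(x)), known to be convex.
expr_atom = cp.logistic(x)
print(expr_atom.is_dcp())     # True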