Search code examples
pythonoptimizationnon-linear-regressiongekko

'@error: Solution Not Found' for estimating system of equations


I'm trying to estimate parameters of a system of equations. I get an error return which isException: @error: Solution Not Found. Is it due to too few degree of freedoms? There seems no other information to deal with the errorNo solution.

Model and script are attached below:

System of Equations:

\[y_{jh} = \beta_{j0} + \sum_{k=1}^{K}\beta_{jk}x_{hk} + \epsilon_{jh}\]
<script type="text/javascript" src="https://www.hostmath.com/Math/MathJax.js?config=OK"></script>

where ßjk and ßj0 are parameters that are unkown and need to be estimated.

Objective function (Minimize Residuals):

\[\sum_{j=1}^{J}\sum_{h=1}^{H}\epsilon_{jh}\]
<script type="text/javascript" src="https://www.hostmath.com/Math/MathJax.js?config=OK"></script>

Constraints:

Some of rows in data contain missing value, so I add constraint on them. They are subject to:

\[\begin{align}
\frac{y_{jh}}{y_{j_{1}h}} &= \frac{\beta_{j0} + \sum_{k=1}^{K}\beta_{jk}x_{hk} + \epsilon_{jh}}{\beta_{j_{1}0} + \sum_{k=1}^{K}\beta_{j_{1}k}x_{hk} + \epsilon_{j_{1}h}}
\end{align}\]
<script type="text/javascript" src="https://www.hostmath.com/Math/MathJax.js?config=OK"></script>

where yj1h ​is the first non-missing point in yjh​, and yjh​ is non-missing points in row h.

Python Codes:

from gekko import GEKKO
import numpy as np

model = GEKKO(remote=True)

# =============================== simulated data =============================

h_size = 500  # sample size
k_xvar = 5  # number of X (variables)
j_cate = 5  # number of y (number of equations)

np.random.seed(1234)

data_X = np.random.normal(0, 10, size=(h_size, k_xvar+1))
data_X[:, 0] = 1  # intercept term

beta = [np.random.uniform(-10, 10, size=k_xvar+1) for _ in range(j_cate)]
data_y = np.array([
    data_X@beta[j] +
    np.random.normal(100, 10, size=h_size) for j in range(j_cate)
])
# randomly select 10% of observations and replace one value of each of them with np.nan
data_y[
    np.random.choice(data_y.shape[0], int(h_size/10), replace=True),
    np.random.choice(data_y.shape[1], int(h_size/10), replace=False)
] = np.nan

# get index of rows and cols where data is nan and non-nan
index_nan = np.where(np.isnan(data_y))
index_notnan = np.where(~np.isnan(data_y))

# ============================= gekko object =============================

beta_jk = model.Array(model.FV, (j_cate, k_xvar+1))
for j in range(j_cate):
    for k in range(k_xvar+1):
        beta_jk[j, k].value = 0
        beta_jk[j, k].STATUS = 1

error_jh = model.Array(model.FV, (j_cate, h_size))
for j in range(j_cate):
    for h in range(h_size):
        error_jh[j, h].value = 0
        error_jh[j, h].STATUS = 1
for j, h in zip(index_nan[0], index_nan[1]):  # where data is nan
    error_jh[j, h].status = 0

ym = model.Array(model.Param, (j_cate, h_size))
for j, h in zip(index_notnan[0], index_notnan[1]):
    ym[j, h].value = data_y[j, h]

# equations
for j, h in zip(index_notnan[0], index_notnan[1]):
    model.Equation(
        ym[j, h] == model.sum(
            beta_jk[j, :]*data_X[h, :]
        ) + error_jh[j, h]
    )

# constraints: the ratio y_j/y_1
if len(index_nan[1]) != 0:  # if there exists nan value
    for h in np.unique(index_nan[1]):
        j_notnan = np.where(~np.isnan(data_y[:, h]))[0].tolist()
        for j in j_notnan[1:]:
            model.Equation(
                (ym[j, h]/ym[j_notnan[0], h]) == (
                    (model.sum(beta_jk[j, :]*data_X[h, :])+error_jh[j, h])/(
                        model.sum(beta_jk[j_notnan[0], :]*data_X[h, :]) +
                        error_jh[j_notnan[0], h]
                    )
                )
            )

model.Minimize(
    model.sum(
        [(error_jh[j, h])**2 for j, h in zip(index_notnan[0], index_notnan[1])]
    )
)

# Application options
model.options.SOLVER = 1

model.solve(disp=True)

And the returns is:

 apm 222.29.98.194_gk_model5 <br><pre> ----------------------------------------------------------------
 APMonitor, Version 1.0.1
 APMonitor Optimization Suite
 ----------------------------------------------------------------

 
 --------- APM Model Size ------------
 Each time step contains
   Objects      :            1
   Constants    :            0
   Variables    :         7481
   Intermediates:            0
   Connections  :         2451
   Equations    :         5051
   Residuals    :         5051

 Number of state variables:           4931
 Number of total equations: -         5051
 Number of slack variables: -            0
 ---------------------------------------
 Degrees of freedom       :           -120

 * Warning: DOF <= 0
 ----------------------------------------------
 Steady State Optimization with APOPT Solver
 ----------------------------------------------

Iter    Objective  Convergence
......

 ---------------------------------------------------
 Solver         :  APOPT (v1.0)
 Solution time  :    42.2448999999906      sec
 Objective      :    55181039.5947782
 Unsuccessful with error code            0
 ---------------------------------------------------

 Creating file: infeasibilities.txt
 Use command apm_get(server,app,'infeasibilities.txt') to retrieve file
 @error: Solution Not Found
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python38\lib\site-packages\gekko\gekko.py", line 2185, in solve
    raise Exception(response)
Exception:  @error: Solution Not Found

How do I check where the error originates from and get successful solution?


Solution

  • The APOPT solver fails to find a solution. Switching to IPOPT with m.options.SOLVER=3 produces an error:

    This is Ipopt version 3.12.10, running with linear solver ma57.
    
    Number of nonzeros in equality constraint Jacobian...:    26601
    Number of nonzeros in inequality constraint Jacobian.:        0
    Number of nonzeros in Lagrangian Hessian.............:     4994
    
    Exception of type: TOO_FEW_DOF in file "IpIpoptApplication.cpp" at line 891:
    Exception message: status != TOO_FEW_DEGREES_OF_FREEDOM evaluated false:
      Too few degrees of freedom (rethrown)!
    
    EXIT: Problem has too few degrees of freedom.
     
     An error occured.
     The error code is          -10
    

    Switching to BPOPT (m.options.SOLVER=2) shows that there is a potential divide-by-zero issue that evaluates as NaN. Try rearranging the equation (a/b==1 to a==b to avoid this problem or adding constraints to make the denominator >0 or <0.

    model.Equation( (ym[j, h]/ym[j_notnan[0], h]) == (
       (model.sum(beta_jk[j, :]*data_X[h, :])+error_jh[j, h])/(
          model.sum(beta_jk[j_notnan[0], :]*data_X[h, :]) +
            error_jh[j_notnan[0], h]
       ))
    

    Rearranging the equation as:

    model.Equation(
      (ym[j, h] * (model.sum(beta_jk[j_notnan[0], :]*data_X[h, :]) +
         error_jh[j_notnan[0], h]) == (ym[j_notnan[0], h]) * 
         (model.sum(beta_jk[j, :]*data_X[h, :])+error_jh[j, h])))
    

    gives a successful solution.

     --------- APM Model Size ------------
     Each time step contains
       Objects      :            1
       Constants    :            0
       Variables    :         7481
       Intermediates:            0
       Connections  :         2451
       Equations    :         5051
       Residuals    :         5051
     
     Number of state variables:           4931
     Number of total equations: -         5051
     Number of slack variables: -            0
     ---------------------------------------
     Degrees of freedom       :           -120
     
     * Warning: DOF <= 0
     ----------------------------------------------
     Steady State Optimization with BPOPT Solver
     ----------------------------------------------
    
    -----------------------------------------------------
     BPOPT Solver v1.0.6
    -----------------------------------------------------
     Iter    Objective  Convergence
        0  2.92792E+05  2.97344E+05
        1  1.93773E+04  2.39292E+05
        2  2.34740E+05  6.55490E-08
        3  2.38837E+05  1.55937E-09
        4  2.39278E+05  9.07134E-10
    Successful solution
     
     ---------------------------------------------------
     Solver         :  BPOPT (v1.0)
     Solution time  :    86.3358000000007      sec
     Objective      :    239292.116505756     
     Successful solution
     ---------------------------------------------------
    

    Here is the full code:

    from gekko import GEKKO
    import numpy as np
    
    model = GEKKO(remote=True)
    
    # =============================== simulated data =============================
    
    h_size = 500  # sample size
    k_xvar = 5  # number of X (variables)
    j_cate = 5  # number of y (number of equations)
    
    np.random.seed(1234)
    
    data_X = np.random.normal(0, 10, size=(h_size, k_xvar+1))
    data_X[:, 0] = 1  # intercept term
    
    beta = [np.random.uniform(-10, 10, size=k_xvar+1) for _ in range(j_cate)]
    data_y = np.array([
        data_X@beta[j] +
        np.random.normal(100, 10, size=h_size) for j in range(j_cate)
    ])
    # randomly select 10% of observations and replace one value of each of them with np.nan
    data_y[
        np.random.choice(data_y.shape[0], int(h_size/10), replace=True),
        np.random.choice(data_y.shape[1], int(h_size/10), replace=False)
    ] = np.nan
    
    # get index of rows and cols where data is nan and non-nan
    index_nan = np.where(np.isnan(data_y))
    index_notnan = np.where(~np.isnan(data_y))
    
    # ============================= gekko object =============================
    
    beta_jk = model.Array(model.FV, (j_cate, k_xvar+1))
    for j in range(j_cate):
        for k in range(k_xvar+1):
            beta_jk[j, k].value = 0
            beta_jk[j, k].STATUS = 1
    
    error_jh = model.Array(model.FV, (j_cate, h_size))
    for j in range(j_cate):
        for h in range(h_size):
            error_jh[j, h].value = 0
            error_jh[j, h].STATUS = 1
    for j, h in zip(index_nan[0], index_nan[1]):  # where data is nan
        error_jh[j, h].status = 0
    
    ym = model.Array(model.Param, (j_cate, h_size))
    for j, h in zip(index_notnan[0], index_notnan[1]):
        ym[j, h].value = data_y[j, h]
    
    # equations
    for j, h in zip(index_notnan[0], index_notnan[1]):
        model.Equation(
            ym[j, h] == model.sum(
                beta_jk[j, :]*data_X[h, :]
            ) + error_jh[j, h]
        )
    
    # constraints: the ratio y_j/y_1
    if len(index_nan[1]) != 0:  # if there exists nan value
        for h in np.unique(index_nan[1]):
            j_notnan = np.where(~np.isnan(data_y[:, h]))[0].tolist()
            for j in j_notnan[1:]:
                model.Equation(
                    (ym[j, h] * (model.sum(beta_jk[j_notnan[0], :]*data_X[h, :]) +
                            error_jh[j_notnan[0], h]
                        ) == (
                        ym[j_notnan[0], h]) * 
                        (model.sum(beta_jk[j, :]*data_X[h, :])+error_jh[j, h])
                    )
                )
    
    model.Minimize(
        model.sum(
            [(error_jh[j, h])**2 for j, h in zip(index_notnan[0], index_notnan[1])]
        )
    )
    
    # Application options
    model.options.SOLVER = 2 # BPOPT solver
    
    model.solve(disp=True)