Tags: python, optimization, gekko

Gekko solver error "'results.json' not found", and being unable to pinpoint the reason


I'm getting the following error message while running a constrained regression with Gekko:

>  ----------------------------------------------------------------
 APMonitor, Version 1.0.1
 APMonitor Optimization Suite
 ----------------------------------------------------------------
 
 
 --------- APM Model Size ------------
 Each time step contains
   Objects      :            2
   Constants    :            0
   Variables    :          119
   Intermediates:            0
   Connections  :           58
   Equations    :           59
   Residuals    :           59
 
 Number of state variables:        3535109
 Number of total equations: -      3535080
 Number of slack variables: -            0
 ---------------------------------------
 Degrees of freedom       :             29
 
 ----------------------------------------------
 Model Parameter Estimation with APOPT Solver
 ----------------------------------------------

Error: forrtl: severe (174): SIGSEGV, segmentation fault occurred

Stack trace terminated abnormally.

Error: 'results.json' not found. Check above for additional error details

To find the reason, I looked into the Gekko run folder (m.path) and found the following files:

     APOPT.out
     dbs_read.rpt
     gk_model0.apm
     gk_model0.csv
     gk_model0.info
     measurements.dbs

So I'm not able to work out what is causing the issue, since no infeasibility file is being generated. The following is the code I'm using:

import gekko as gk
import numpy as np

def gk_enet(y, x, min_l2, indep_vars, sign_dict, l1_penalty=True):

    # setting the Gekko solver
    m = gk.GEKKO(remote=False)
    m.options.MAX_ITER = 100000
    m.options.IMODE = 2  # Regression mode
    opt_path = m.path
    print(m.path)
    model_iter = opt_path.rsplit("_", 1)[1]
    

    # assigning gekko solver to minimize SSE instead of default l1-norm minimizer
    if not l1_penalty:
        m.options.EV_TYPE = 2

    # setting up the model & parameters

    # number of variables
    n = x.shape[1]
    
    # assigning array of parameters: c stands for coefficients
    # assigning sign-restrictions based on sign_dict
    c = m.Array(m.FV, n)
    for i, ci in enumerate(c):
        ci.STATUS = 1
        if sign_dict[indep_vars[i]] > 0:
            ci.LOWER = 0
        elif sign_dict[indep_vars[i]] < 0:
            ci.UPPER = 0

    # parameter for l2-regularization
    R = m.FV() 
    R.STATUS = 1
    R.LOWER = min_l2

    # array of gekko variable to predict the coefficients' effects
    x_pred = [None]*n  

    # load data
    xd = m.Array(m.Param, n)
    #xd_intcpt = m.Array(m.Param, 1)
    yd = m.Param(value=y)
    
    for i in range(n):
        xd[i].value = x[:, i]
        x_pred[i] = m.Var()  # y = sum(c[i]*x[i])
        m.Equation(x_pred[i] == c[i]*xd[i])

    y_pred = m.Var()
    m.Equation(y_pred == m.sum([x_pred[i] for i in range(n)]))
    
    l2_penalty = m.Var()
    m.Equation(l2_penalty == R*m.sum([c[i]**2 for i in range(n)]))
    
    # Minimize difference between actual and predicted y + l2-regularization factor
    m.Minimize((yd-y_pred)**2 + l2_penalty)

    # APOPT solver
    m.options.SOLVER = 1
    # Solve
    try:
        m.solve(disp=True)

        a = [np.round(i.value[0], 4) for i in c]
    except Exception:
        # fall back to zero coefficients if the solve fails
        a = [0 for i in c]
        
    return a, min_l2

Any suggestions on how to tackle this issue would be very helpful. Also, please let me know if any further info is required.

Edit: Thanks John for the suggestions. I've reduced the data and re-run the code, first with the BPOPT solver and then with IPOPT. While running the BPOPT solver I got the following error:

MATRIX IS SINGULAR. RANK=    0
Problem with linear solver, INFO:            3
Error in initialization of lagrange multipliers
Setting lam(1:m) = 0

While running the IPOPT, got the following message:

solver            3  not supported
 using default solver: APOPT

But this time the APOPT solver provided a solution fairly quickly. I have yet to check the results, but looking at the BPOPT error message I suspect the issue lies in the data, even in the previous case with the untrimmed data. Is there any debugging option in Gekko that I can use to confirm this?


Solution

  • The solver crashed with a segmentation fault because it couldn't access the memory it needed. The size of the optimization problem:

     Number of state variables:        3535109
     Number of total equations: -      3535080
     Number of slack variables: -            0
     ---------------------------------------
     Degrees of freedom       :             29
    

    shows that it is likely near the upper limit of what the APOPT solver can successfully handle.
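A back-of-the-envelope sketch of why the state count explodes: in IMODE=2 the model is replicated for every data row, so the total number of state variables scales as rows × variables-per-time-step. The row count below is a guess for illustration only; it is not shown in the question:

```python
# Rough model-size estimate for IMODE=2 (the row count is an assumption
# for illustration; it does not appear in the solver log above).
vars_per_step = 119          # "Variables" per time step, from the log
n_rows = 29_000              # hypothetical number of data rows

# IMODE=2 replicates the model for each data row, so the state vector
# holds roughly rows * variables-per-step entries.
approx_states = n_rows * vars_per_step
print(f"approximate state variables: {approx_states:,}")
```

With row counts in that range, the estimate lands in the same few-million territory as the 3,535,109 reported by the solver.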

    Suggestion 1: Model Reduction

    Try using m.Intermediate() instead of m.Var() if there are no constraints on y_pred or x_pred. Intermediates are substituted into the model rather than added to the state vector, so they can dramatically reduce the problem size.

    import gekko as gk
    import numpy as np

    def gk_enet(y, x, min_l2, indep_vars, sign_dict, l1_penalty=True):
    
        # setting the Gekko solver
        m = gk.GEKKO(remote=False)
        m.options.MAX_ITER = 100000
        m.options.IMODE = 2  # Regression mode
        opt_path = m.path
        print(m.path)
        model_iter = opt_path.rsplit("_", 1)[1]
        
    
        # assigning gekko solver to minimize SSE instead of default l1-norm minimizer
        if not l1_penalty:
            m.options.EV_TYPE = 2
    
        # setting up the model & parameters
    
        # number of variables
        n = x.shape[1]
        
        # assigning array of parameters: c stands for coefficients
        # assigning sign-restrictions based on sign_dict
        c = m.Array(m.FV, n)
        for i, ci in enumerate(c):
            ci.STATUS = 1
            if sign_dict[indep_vars[i]] > 0:
                ci.LOWER = 0
            elif sign_dict[indep_vars[i]] < 0:
                ci.UPPER = 0
    
        # parameter for l2-regularization
        R = m.FV() 
        R.STATUS = 1
        R.LOWER = min_l2
    
        # array of gekko variables to predict the coefficients' effects
        x_pred = [None]*n  
    
        # load data
        xd = m.Array(m.Param, n)
        #xd_intcpt = m.Array(m.Param, 1)
        yd = m.Param(value=y)
        
        for i in range(n):
            xd[i].value = x[:, i]
            x_pred[i] = m.Intermediate(c[i]*xd[i])
    
        y_pred = m.Intermediate(m.sum([x_pred[i] for i in range(n)]))
        
        l2_penalty = m.Intermediate(R*m.sum([c[i]**2 for i in range(n)]))
        
        # Minimize difference between actual and predicted y + l2-regularization factor
        m.Minimize((yd-y_pred)**2 + l2_penalty)
    
        # APOPT solver
        m.options.SOLVER = 1
        # Solve
        try:
            m.solve(disp=True)
    
            a = [np.round(i.value[0], 4) for i in c]
        except Exception:
            # fall back to zero coefficients if the solve fails
            a = [0 for i in c]
            
        return a, min_l2
    

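Once the reduced model solves, the unconstrained fit can be sanity-checked against the closed-form ridge solution. This is a numpy sketch with synthetic data standing in for the real x and y; it ignores the sign constraints, which only matter when the unconstrained optimum violates them:

```python
import numpy as np

# Closed-form ridge regression: b = (X'X + lam*I)^-1 X'y.
# Synthetic data stand in for the real x and y used above.
rng = np.random.default_rng(0)
N, n = 200, 5
X = rng.normal(size=(N, n))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ beta_true + 0.01 * rng.normal(size=N)

lam = 0.1  # plays the role of the R penalty weight in gk_enet
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
print(np.round(beta_ridge, 3))
```

If the Gekko coefficients differ substantially from this benchmark on the same data, the model formulation (rather than the solver) is the first place to look.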
    Suggestion 2: Switch Solver

    Try switching to another solver such as BPOPT (SOLVER=2) or IPOPT (SOLVER=3):

    m.options.SOLVER = 3  # 1=APOPT, 2=BPOPT, 3=IPOPT

    Note that IPOPT is only available with remote=True (the problem is then solved on the public server); a local run with remote=False includes only the APOPT and BPOPT solvers, which is why requesting solver 3 locally falls back to the default APOPT.


    There is no indication that a mixed-integer solution is needed, so switching from a Mixed Integer Nonlinear Programming (MINLP) solver (APOPT) to a Nonlinear Programming (NLP) solver (IPOPT) may help. If the problem is a Quadratically Constrained Quadratic Program (QCQP), there are specialized solvers that may work better than general NLP solvers.
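Since the objective is a convex quadratic and the sign restrictions are simple bounds, even a plain projected-gradient loop can solve it. This numpy sketch (synthetic data, hypothetical sign pattern) illustrates the QP structure mentioned above:

```python
import numpy as np

# Projected gradient for sign-constrained ridge regression:
#   minimize ||y - X@b||^2 + lam*||b||^2  s.t. b[i] >= 0 where flagged.
rng = np.random.default_rng(1)
N, n = 300, 4
X = rng.normal(size=(N, n))
b_true = np.array([2.0, 1.0, -0.5, 0.3])
y = X @ b_true + 0.01 * rng.normal(size=N)

lam = 0.1
nonneg = np.array([True, True, False, True])  # hypothetical sign pattern

b = np.zeros(n)
L = 2 * (np.linalg.norm(X, 2) ** 2 + lam)  # Lipschitz constant of the gradient
for _ in range(2000):
    grad = 2 * (X.T @ (X @ b - y) + lam * b)
    b -= grad / L                           # gradient step
    b[nonneg] = np.maximum(b[nonneg], 0.0)  # project onto the sign constraints
print(np.round(b, 3))
```

The projection step is all that distinguishes this from ordinary ridge regression, which is why the problem remains easy for any QP-capable solver.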

    If you can submit a complete problem with randomly generated data, the developers of the APOPT solver can use it to locate the error. You can submit the problem as a new issue on the Gekko or APOPT GitHub repositories.
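A sketch of how such a randomly generated test case could be built and saved for an issue report; the sizes and file names are placeholders (28 regressors guessed from the 29 degrees of freedom above), and n_rows would be scaled up until the failure reappears:

```python
import numpy as np

# Build a shareable synthetic dataset with the same shape as the real
# problem (sizes are placeholders for illustration).
rng = np.random.default_rng(42)
n_rows, n_vars = 1000, 28
X = rng.normal(size=(n_rows, n_vars))
beta = rng.normal(size=n_vars)
y = X @ beta + 0.1 * rng.normal(size=n_rows)

np.savetxt("repro_X.csv", X, delimiter=",")
np.savetxt("repro_y.csv", y, delimiter=",")
print(X.shape, y.shape)
```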