Search code examples
matlablinear-programmingleast-squaresmixed-integer-programming

linear combination of curves to match a single curve with integer constraints


I have a set of vectors (curves) which I would like to match to a single curve. The issue isnt only finding a linear combination of the set of curves which will most closely match the single curve (this can be done with least squares Ax = B). I need to be able to add constraints, for example limiting the number of curves used in the fitting to a particular number, or that the curves lie next to each other. These constraints would be found in mixed integer linear programming optimization.

I have started by using lsqlin which allows constraints and have been able to limit the variable to be > 0.0, but in terms of adding further constraints I am at a loss. Is there a way to add integer constraints to least squares, or alternatively is there a way to solve this with a MILP?

any help in the right direction much appreciated!

Edit: Based on the suggestion by ErwinKalvelagen I am attempting to use CPLEX and its quadtratic solvers, however until now I have not managed to get it working. I have created a minimal 'notworking' example and have uploaded the data here and code here below. The issue is that matlabs LS solver lsqlin is able to solve, however CPLEX cplexlsqnonneglin returns CPLEX Error 5002: %s is not convex for the same problem.

function [ ] = minWorkingLSexample(  )
%MINWORKINGLSEXAMPLE for LS with matlab and CPLEX
%matlab is able to solve the least squares, CPLEX returns error:

% Error using cplexlsqnonneglin
% CPLEX Error  5002: %s is not convex.
%
%
% Error in Backscatter_Transform_excel2_readMut_LINPROG_CPLEX (line 203)
%         cplexlsqnonneglin (C,d);
%  

load('C_n_d_2.mat')

lb = zeros(size(C,2),1);
options = optimoptions('lsqlin','Algorithm','trust-region-reflective');
[fact2,resnorm,residual,exitflag,output] = ...
          lsqlin(C,d,[],[],[],[],lb,[],[],options);


%% CPLEX
ctype = cellstr(repmat('C',1,size(C,2)));
options = cplexoptimset;
options.Display = 'on';

[fact3, resnorm, residual, exitflag, output] = ...
    cplexlsqnonneglin (C,d);

end

Solution

  • I could reproduce the Cplex problem. Here is a workaround. Instead of solving the first model, use a model that is less nonlinear:

    enter image description here

    The second model solves fine with Cplex. The problem is somewhat of a tolerance/numeric issue. For the second model we have a much more well-behaved Q matrix (a diagonal). Essentially we moved some of the complexity from the objective into linear constraints.

    You should now see something like:

    Tried aggregator 1 time.
    QP Presolve eliminated 1 rows and 1 columns.
    Reduced QP has 401 rows, 443 columns, and 17201 nonzeros.
    Reduced QP objective Q matrix has 401 nonzeros.
    Presolve time = 0.02 sec. (1.21 ticks)
    Parallel mode: using up to 8 threads for barrier.
    Number of nonzeros in lower triangle of A*A' = 80200
    Using Approximate Minimum Degree ordering
    Total time for automatic ordering = 0.00 sec. (3.57 ticks)
    Summary statistics for Cholesky factor:
      Threads                   = 8
      Rows in Factor            = 401
      Integer space required    = 401
      Total non-zeros in factor = 80601
      Total FP ops to factor    = 21574201
     Itn      Primal Obj        Dual Obj  Prim Inf Upper Inf  Dual Inf          
       0   3.3391791e-01  -3.3391791e-01  9.70e+03  0.00e+00  4.20e+04
       1   9.6533667e+02  -3.0509942e+03  1.21e-12  0.00e+00  1.71e-11
       2   6.4361775e+01  -3.6729243e+02  3.08e-13  0.00e+00  1.71e-11
       3   2.2399862e+01  -6.8231454e+01  1.14e-13  0.00e+00  3.75e-12
       4   6.8012056e+00  -2.0011575e+01  2.45e-13  0.00e+00  1.04e-12
       5   3.3548410e+00  -1.9547176e+00  1.18e-13  0.00e+00  3.55e-13
       6   1.9866256e+00   6.0981384e-01  5.55e-13  0.00e+00  1.86e-13
       7   1.4271894e+00   1.0119284e+00  2.82e-12  0.00e+00  1.15e-13
       8   1.1434804e+00   1.1081026e+00  6.93e-12  0.00e+00  1.09e-13
       9   1.1163905e+00   1.1149752e+00  5.89e-12  0.00e+00  1.14e-13
      10   1.1153877e+00   1.1153509e+00  2.52e-11  0.00e+00  9.71e-14
      11   1.1153611e+00   1.1153602e+00  2.10e-11  0.00e+00  8.69e-14
      12   1.1153604e+00   1.1153604e+00  1.10e-11  0.00e+00  8.96e-14
    Barrier time = 0.17 sec. (38.31 ticks)
    
    Total time on 8 threads = 0.17 sec. (38.31 ticks)
    QP status(1): optimal
    Cplex Time: 0.17sec (det. 38.31 ticks)
    
    Optimal solution found.
    Objective :           1.115360
    

    See here for some details.

    Update: In Matlab this becomes:

    enter image description here