Search code examples
pythonoptimizationconstraintsleast-squareslmfit

How to solve a least square problem with LMFIT with non-linear constraints?


I have a least square problem at hand: Ax=b

Without any constraint, it can be solved like below:

import numpy as np
from lmfit import Model, Parameters, Minimizer, report_fit
from numpy.linalg import lstsq

A = np.array([[ 143, -144, -343, 56, 98, 54]
,[-7632,  77, -277, 63, 999, 345 ]
,[ 55, -75,   74, 744, 765, 77]
,[-22 , 177, -28, 12, 54, 577]
,[-848 , -433 , 121, 54, 643, 88]
,[98, 75, 155, 87, 23, 54]
,[12, 56, 44, 44, 86, 46]
,[75, 22, 111, 965, 486, 345]])

b = [-114, -734, -270,  577,  676, 122, 245, 654]
b = np.reshape(b, (8, 1))

x, residuals, rnk, s = lstsq(A, b, rcond=-1)
print (" x =", x)

 x = [[-0.04886372]
 [-2.38958265]
 [ 1.44719216]
 [ 0.42944648]
 [-1.23440929]
 [ 1.98725466]]

I want to add a repetitive constraint on each pair of results and solve this problem by using LMFIT since it seems to be more convenient to define the constraints. I tried this but it doesn't work. The constraint is that, for example, for the first pair of results x1 must be positive and -x1 <= x2 <= x1 and the same pattern for the next pairs.

    def fcn2min(params, A, b):

        x1 = params['x1']
        x2 = params['x2']
        x3 = params['x3']
        x4 = params['x4']
        x5 = params['x5']
        x6 = params['x6']
        x = np.array([x1, x2, x3, x4, x5, x6])
        x = np.reshape(x, (6, 1))

        return np.sum((np.dot(A, x)-b)**2)


    params = Parameters()
    params.add('x1', min=0)
    params.add('xangle1', value=0.05, vary=True, min=(-np.pi/2)+(0.00000000001), max=(np.pi/2)-(0.00000000001))
    params.add('x2', expr='(x1)*(sin(xangle1))')
    params.add('x3', min=0)
    params.add('xangle2', value=0.05, vary=True, min=(-np.pi/2)+(0.00000000001), max=(np.pi/2)-(0.00000000001))
    params.add('x4', expr='(x3)*(sin(xangle2))')
    params.add('x5', min=0)
    params.add('xangle3', value=0.05, vary=True, min=(-np.pi/2)+(0.00000000001), max=(np.pi/2)-(0.00000000001))
    params.add('x6', expr='(x5)*(sin(xangle3))')


    mini = Minimizer(fcn2min, params, fcn_args=(A, b))
    out = mini.leastsq()

    print (" report_fit(result) =", report_fit(out.params))

First, it doesn't work and I have difficulty to define and solve the problem. Second, it would be pretty hard to manually define the parameters and constraints for a large scale problem, let's say when the x has 100 or 1000 elements. Any help to handle this problem is appreciated. Thank you.


Solution

  • It is always helpful to say what you actually got. Saying "it doesn't work" is basically useless. You should report the output and any error messages you get. Including a description of what you think you should get is also sometimes helpful.

    A couple of points:

    • You need to NOT return np.sum((np.dot(A, x)-b)**2) and return np.dot(A, x)-b instead. The fit will do the squaring and summing.
    • You should give explicit initial values for the variable parameters.

    With that, you will get a fit that runs to completion and satisfies your constraints:

    def fcn2min(params, A, b):
        x1 = params['x1']
        x2 = params['x2']
        x3 = params['x3']
        x4 = params['x4']
        x5 = params['x5']
        x6 = params['x6']
        x = np.array([x1, x2, x3, x4, x5, x6]).reshape((6, 1))
        return  np.dot(A, x) - b
    
    params = Parameters()
    params.add('x1', 0.25, min=0)
    params.add('xangle1', value=0.05, vary=True, min=(-np.pi/2)+(0.00000000001), max=(np.pi/2)-(0.00000000001))
    params.add('x2', expr='(x1)*(sin(xangle1))')
    params.add('x3', 0.25, min=0)
    params.add('xangle2', value=0.05, vary=True, min=(-np.pi/2)+(0.00000000001), max=(np.pi/2)-(0.00000000001))
    params.add('x4', expr='(x3)*(sin(xangle2))')
    params.add('x5', 0.25, min=0)
    params.add('xangle3', value=0.05, vary=True, min=(-np.pi/2)+(0.00000000001), max=(np.pi/2)-(0.00000000001))
    params.add('x6', expr='(x5)*(sin(xangle3))')
    
    mini = Minimizer(fcn2min, params, fcn_args=(A, b))
    out = mini.leastsq()
    report_fit(out, min_correl=0.5)
    

    which will print out:

    [[Fit Statistics]]
        # fitting method   = leastsq
        # function evals   = 75
        # data points      = 8
        # variables        = 6
        chi-square         = 597869.757
        reduced chi-square = 298934.879
        Akaike info crit   = 101.773493
        Bayesian info crit = 102.250143
    [[Variables]]
        x1:       0.21393627 +/- 0.19714856 (92.15%) (init = 0.25)
        xangle1: -0.01575861 +/- 9.59829799 (60908.27%) (init = 0.05)
        x2:      -0.00337120 +/- 2.05111043 (60842.16%) == '(x1)*(sin(xangle1))'
        x3:       0.89232913 +/- 1.36874096 (153.39%) (init = 0.25)
        xangle2: -0.86188487 +/- 2.50527077 (290.67%) (init = 0.05)
        x4:      -0.67734114 +/- 0.99201291 (146.46%) == '(x3)*(sin(xangle2))'
        x5:       0.89482649 +/- 1.32577015 (148.16%) (init = 0.25)
        xangle3:  1.57078737 +/- 398996.439 (25401047.11%) (init = 0.05)
        x6:       0.89482649 +/- 4.38420893 (489.95%) == '(x5)*(sin(xangle3))'
    [[Correlations]] (unreported correlations are < 0.500)
        C(x1, x5)           =  0.878
        C(x5, xangle3)      =  0.853
        C(xangle1, xangle3) =  0.829
        C(xangle1, x5)      =  0.808
        C(x1, xangle2)      = -0.802
        C(x3, xangle2)      =  0.732
        C(xangle2, x5)      = -0.729
        C(x1, xangle1)      =  0.663
        C(x1, xangle3)      =  0.640
        C(xangle2, xangle3) = -0.548
        C(xangle1, xangle2) = -0.531
    

    The uncertainties are very, very high, suggesting something may not be right with your model for the constraints.