Search code examples
pythonnumpyscipylinear-algebraleast-squares

Numpy/scipy - How to find the least squares solution with the constraint that Ax >= b?


I have a linear algebra problem that I can't find the correct function for; I need to find the vector x that satisfies Ax = b. Because A is non-square (there's 5 columns and something like 37 rows), np.linalg.solve(A,b) will not work - most search results I've seen have pointed me towards np.linalg.lstsq instead, or scipy.optimize.nnls if x must be strictly non-negative (which I do want).

The problem is that I need to also add on the constraint that the resulting Ax must be strictly greater-than-or-equal-to b; that is, no element of Ax can be less than the corresponding element of b. This is what I'm currently doing:

import numpy as np
import scipy

v1 = np.array([0.00000961,0.0000011,0.0000011,0.000015,0.00000884,0.00000286,0,0.00000006,0,0.000196,0,0.0000071,0.000000023,0.000131,0.00038,0,0,0.00000161,0,0,0.0000069,0.00027,0.000005,0,0,0.00054,0.00475,0.000000002,0.00036,0.0000032,0,0,0.033,0.0015,0.02,0.00052,0.207]).T
v2 = np.array([0,0.0000064,0.00000135,0.000121,0.0000177,0.00000348,0,0.00000006,0,0,0,0.0000833,0,0.000525,0.00062,0,0,0.0000114,0,0,0.0000458,0.00168,0.0000193,0,0,0.00376,0.00705,0.000000072,0.00018,0.0000327,0,0,0.085,0.492,0.258,0.0628,0.161]).T
v3 = np.array([0,0.000009,0.00000193,0.0000196,0.00000899,0,0,0.00000444,0,0,0,0.0000021,0.000000056,0.000664,0.00123,0,0,0.00000841,0,0,0.0000502,0.00171,0.0000106,0,0,0.00352,0.0148,0.000000032,0.00005,0.0000365,0,0,0.155,0.0142,0.216,0.00366,0.624]).T
v4 = np.array([0,0.00000763,0.00000139,0.00000961,0.0000135,0.00000119,0,0.00000056,0,0,0,0,0,0,0.00054,0,0,0.00000626,0,0,0.0000472,0.00177,0.0000492,0,0,0.00523,0.00429,0,0.00002,0.0000397,0,0,0.106,0.069,0.169,0.0122,0.663]).T
v5 = np.array([0.00000241,0.00000113,0.00000347,0.0000118,0.0000037,0.00000147,0,0.00000062,0,0.000934,0,0,0,0.000005,0.00254,0,0,0.00000053,0,0,0.000016,0.00033,0.0000092,0,0,0.00055,0.00348,0,0.00053,0.0000039,0,0,0.041,0.0149,0.0292,0.00178,0.0442]).T

b = np.array([0.00657,0.00876,0.00949,0.1168,0.0365,0.01241,0.000219,0.00292,0,0.657,0.000146,0.1095,0.000876,4.015,8.76,16.79,0.0002555,0.00657,0.0292,0.001095,0.0584,3.066,0.01679,0.0003285,0,5.11,24.82,0.0004015,10.95,0.0803,0,0.00219,204.4,569.4,365,146,2007.5]).T

A = np.column_stack((v1,v2,v3,v4,v5))

result = scipy.optimize.nnls(A,b)

x = result[0][:,None] # Force to be column array; the A @ x step doesn't seem to work if I don't do this
                      # https://stackoverflow.com/questions/64869851/converting-numpy-array-to-a-column-vector

lsq = A @ x # result is a column vector
diff = lsq.T - b # b is still technically a row vector, so lsq has to be converted back to a row vector for dimensions to match
print(diff)

If Ax >= b, then all elements of diff (= Ax - b) should be positive (or 0). Instead, what I'm getting is that some elements of diff are negative.

Is there any function that can be used to force the Ax >= b constraint?

(The context for why this constraint has to be imposed is that A encodes the amount of vitamins/minerals/macronutrients (rows) in 5 different foods (columns), and b is the total amount of each nutrient that has to be consumed to avoid a nutritional deficiency.)


Solution

  • It doesn't sound like the least squares criterion is the important part; you just want a solution that is not too wasteful. In that case, as Nick pointed out, you could use linear programming to minimimize the total food required. (A common variant of this problem is to minimize the cost of the food consumed. You could also minimize other things, like the sum or maximum of the excess nutrition, but that's a little more complicated, and the objective function is not the biggest issue here...)

    import numpy as np
    from scipy import optimize
    
    # Data copied from above
    v1 = np.array([0.00000961,0.0000011,0.0000011,0.000015,0.00000884,0.00000286,0,0.00000006,0,0.000196,0,0.0000071,0.000000023,0.000131,0.00038,0,0,0.00000161,0,0,0.0000069,0.00027,0.000005,0,0,0.00054,0.00475,0.000000002,0.00036,0.0000032,0,0,0.033,0.0015,0.02,0.00052,0.207]).T
    v2 = np.array([0,0.0000064,0.00000135,0.000121,0.0000177,0.00000348,0,0.00000006,0,0,0,0.0000833,0,0.000525,0.00062,0,0,0.0000114,0,0,0.0000458,0.00168,0.0000193,0,0,0.00376,0.00705,0.000000072,0.00018,0.0000327,0,0,0.085,0.492,0.258,0.0628,0.161]).T
    v3 = np.array([0,0.000009,0.00000193,0.0000196,0.00000899,0,0,0.00000444,0,0,0,0.0000021,0.000000056,0.000664,0.00123,0,0,0.00000841,0,0,0.0000502,0.00171,0.0000106,0,0,0.00352,0.0148,0.000000032,0.00005,0.0000365,0,0,0.155,0.0142,0.216,0.00366,0.624]).T
    v4 = np.array([0,0.00000763,0.00000139,0.00000961,0.0000135,0.00000119,0,0.00000056,0,0,0,0,0,0,0.00054,0,0,0.00000626,0,0,0.0000472,0.00177,0.0000492,0,0,0.00523,0.00429,0,0.00002,0.0000397,0,0,0.106,0.069,0.169,0.0122,0.663]).T
    v5 = np.array([0.00000241,0.00000113,0.00000347,0.0000118,0.0000037,0.00000147,0,0.00000062,0,0.000934,0,0,0,0.000005,0.00254,0,0,0.00000053,0,0,0.000016,0.00033,0.0000092,0,0,0.00055,0.00348,0,0.00053,0.0000039,0,0,0.041,0.0149,0.0292,0.00178,0.0442]).T
    
    b = np.array([0.00657,0.00876,0.00949,0.1168,0.0365,0.01241,0.000219,0.00292,0,0.657,0.000146,0.1095,0.000876,4.015,8.76,16.79,0.0002555,0.00657,0.0292,0.001095,0.0584,3.066,0.01679,0.0003285,0,5.11,24.82,0.0004015,10.95,0.0803,0,0.00219,204.4,569.4,365,146,2007.5]).T
    
    A = np.column_stack((v1,v2,v3,v4,v5))
    
    n = A.shape[1]  # 5
    
    # Minimimize the amount of food consumed
    c = np.ones(n)
    
    # The lower bound on `x` is 0 (upper bound is infinite by default)
    bounds = optimize.Bounds(lb=np.zeros(n))
    
    # The lower bound on `A@x` is `b` (upper bound is infinite by default)
    constraints = optimize.LinearConstraint(A, lb=b)
    
    res = optimize.milp(c, bounds=bounds, constraints=constraints)
    res
    #       message: The problem is infeasible. (HiGHS Status 8: model_status is Infeasible; primal_status is None)
    #         success: False
    #          status: 2
    ...
    

    Wait, what? Infeasible?

    Yes, because some nutrients you needed are just not present in the foods.

    np.where(np.sum(A, axis=1) == 0)[0]
    # array([ 6,  8, 10, 15, 16, 18, 19, 23, 24, 30, 31])
    

    Please offer more nutritious food.