Search code examples
numpyscipysparse-matrixmathematical-optimizationequation-solving

solving a sparse non linear system of equations using scipy.optimize.root


I want to solve the following non-linear system of equations.

enter image description here

Notes

  • the dot between a_k and x represents dot product.
  • the 0 in the first equation represents 0 vector and 0 in the second equation is scaler 0
  • all the matrices are sparse if that matters.

Known

  • K is an n x n (positive definite) matrix
  • each A_k is a known (symmetric) matrix
  • each a_k is a known n x 1 vector
  • N is known (let's say N = 50). But I need a method where I can easily change N.

Unknown (trying to solve for)

  • x is an n x 1 a vector.
  • each alpha_k for 1 <= k <= N a scaler

My thinking.

I am thinking of using scipy root to find x and each alpha_k. We essentially have n equations from each row of the first equation and another N equations from the constraint equations to solve for our n + N variables. Therefore we have the required number of equations to have a solution.

I also have a reliable initial guess for x and the alpha_k's.

Toy example.

n = 4
N = 2
K = np.matrix([[0.5, 0, 0, 0], [0, 1, 0, 0],[0,0,1,0], [0,0,0,0.5]])
A_1 = np.matrix([[0.98,0,0.46,0.80],[0,0,0.56,0],[0.93,0.82,0,0.27],[0,0,0,0.23]])
A_2 = np.matrix([[0.23, 0,0,0],[0.03,0.01,0,0],[0,0.32,0,0],[0.62,0,0,0.45]])
a_1 = np.matrix(scipy.rand(4,1))
a_2 = np.matrix(scipy.rand(4,1))

We are trying to solve for

 x = [x1, x2, x3, x4] and alpha_1, alpha_2

Questions:

  1. I can actually brute force this toy problem and feed it to the solver. But how do I do I solve this toy problem in such a way that I can extend it easily to the case when I have let's say n=50 and N=50
  2. I will probably have to explicitly compute the Jacobian for larger matrices??.

Can anyone give me any pointers?


Solution

  • I think the scipy.optimize.root approach holds water, but steering clear of the trivial solution might be the real challenge for this system of equations.

    In any event, this function uses root to solve the system of equations.

    def solver(x0, alpha0, K, A, a):
    '''
    x0     - nx1 numpy array. Initial guess on x.
    alpha0 - nx1 numpy array. Initial guess on alpha.
    K      - nxn numpy.array.
    A      - Length N List of nxn numpy.arrays.
    a      - Length N list of nx1 numpy.arrays.
    '''
    
    # Establish the function that produces the rhs of the system of equations.
    n = K.shape[0]
    N = len(A)
    def lhs(x_alpha):
        '''
        x_alpha is a concatenation of x and alpha.
        '''
    
        x = np.ravel(x_alpha[:n])
        alpha = np.ravel(x_alpha[n:])
        lhs_top = np.ravel(K.dot(x))
        for k in xrange(N):
            lhs_top += alpha[k]*(np.ravel(np.dot(A[k], x)) + np.ravel(a[k]))
    
        lhs_bottom = [0.5*x.dot(np.ravel(A[k].dot(x))) + np.ravel(a[k]).dot(x)
                      for k in xrange(N)]
    
        lhs = np.array(lhs_top.tolist() + lhs_bottom)
    
        return lhs
    
    # Solve the system of equations.
    x0.shape = (n, 1)
    alpha0.shape = (N, 1)
    x_alpha_0 = np.vstack((x0, alpha0))
    sol = root(lhs, x_alpha_0)
    x_alpha_root = sol['x']
    
    # Compute norm of residual.
    res = sol['fun']
    res_norm = np.linalg.norm(res)
    
    # Break out the x and alpha components.
    x_root = x_alpha_root[:n]
    alpha_root = x_alpha_root[n:]
    
    
    return x_root, alpha_root, res_norm
    

    Running on the toy example, however, only produces the trivial solution.

    # Toy example.
    n = 4
    N = 2
    K = np.matrix([[0.5, 0, 0, 0], [0, 1, 0, 0],[0,0,1,0], [0,0,0,0.5]])
    A_1 = np.matrix([[0.98,0,0.46,0.80],[0,0,0.56,0],[0.93,0.82,0,0.27],      
                    [0,0,0,0.23]])
    A_2 = np.matrix([[0.23, 0,0,0],[0.03,0.01,0,0],[0,0.32,0,0],
          [0.62,0,0,0.45]])
    a_1 = np.matrix(scipy.rand(4,1))
    a_2 = np.matrix(scipy.rand(4,1))
    A = [A_1, A_2]
    a = [a_1, a_2]
    x0 = scipy.rand(n, 1)
    alpha0 = scipy.rand(N, 1)
    
    print 'x0 =', x0
    print 'alpha0 =', alpha0
    
    x_root, alpha_root, res_norm = solver(x0, alpha0, K, A, a)
    
    print 'x_root =', x_root
    print 'alpha_root =', alpha_root
    print 'res_norm =', res_norm
    

    Output is

    x0 = [[ 0.00764503]
     [ 0.08058471]
     [ 0.88300129]
     [ 0.85299622]]
    alpha0 = [[ 0.67872815]
     [ 0.69693346]]
    x_root = [  9.88131292e-324  -4.94065646e-324   0.00000000e+000        
              0.00000000e+000]
    alpha_root = [ -4.94065646e-324   0.00000000e+000]
    res_norm = 0.0