python, least-squares, data-fitting

Least squares fit in python for 3d surface


I would like to fit my surface equation to some data. I already tried scipy.optimize.leastsq, but as I cannot specify the bounds it gives me unusable results. I also tried scipy.optimize.least_squares, but it gives me an error:

ValueError: too many values to unpack

My equation is:

f(x,y,z)=(x-A+y-B)/2+sqrt(((x-A-y+B)/2)^2+C*z^2)

Parameters A, B, C should be found so that the equation above is as close to zero as possible when the following points are used for x, y, z:

    [
        [-0.071, -0.85, 0.401],
        [-0.138, -1.111, 0.494],
        [-0.317, -0.317, -0.317],
        [-0.351, -2.048, 0.848]
    ]

The bounds would be A > 0, B > 0, C > 1

How should I obtain such a fit? What is the best tool in Python to do that? I searched for examples on how to fit 3D surfaces, but most of the examples involving function fitting are about line or flat surface fits.


Solution

  • I've edited this answer to provide a more general example of how this problem can be solved with scipy's general-purpose optimize.minimize method as well as with scipy's optimize.least_squares method.


    First, let's set up the problem:

    import numpy as np
    import scipy.optimize
    
    # ===============================================
    # SETUP: define common components of the problem
    
    
    def our_function(coeff, data):
        """
        The function we care to optimize.
    
        Args:
            coeff (np.ndarray): the parameters that we care to optimize.
            data (np.ndarray): the input data
        """
        A, B, C = coeff
        x, y, z = data.T
        return (x - A + y - B) / 2 + np.sqrt(((x - A - y + B) / 2) ** 2 + C * z ** 2)
    
    
    # Define some training data
    data = np.array([
        [-0.071, -0.85, 0.401],
        [-0.138, -1.111, 0.494],
        [-0.317, -0.317, -0.317],
        [-0.351, -2.048, 0.848]
    ])
    # Define training target
    # This is what we want the target function to be equal to
    target = 0
    
    # Make an initial guess as to the parameters
    # either a constant or random guess is typically fine
    num_coeff = 3
    coeff_0 = np.ones(num_coeff)
    # coeff_0 = np.random.rand(num_coeff)
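
    # (optional) sanity-check the setup by evaluating the function at the
    # initial guess; the outputs should be visibly far from the target of zero
    initial_output = our_function(coeff_0, data)
    print('initial_output = %r' % (initial_output,))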
    

    This isn't strictly least squares, but how about something like this? This solution is like throwing a sledgehammer at the problem. There is probably a way to use a dedicated least-squares solver to get a solution more efficiently, but if you're just looking for an answer, scipy.optimize.minimize will find you one.

    # ===============================================
    # FORMULATION #1: a general minimization problem
    
    # Here the bounds and error are all specified within the general objective function
    def general_objective(coeff, data, target):
        """
        General function that simply returns a value to be minimized.
        The coeff will be modified to minimize whatever the output of this function
        may be.
        """
        # Constraints to keep coeff above 0
        if np.any(coeff < 0):
            # If any constraint is violated return infinity
            return np.inf
        # The function we care about
        prediction = our_function(coeff, data)
        # (optional) L2 regularization to keep coeff small
        # (optional) reg_amount = 0.0
        # (optional) reg = reg_amount * np.sqrt((coeff ** 2).sum())
        losses = (prediction - target) ** 2
        # (optional) losses += reg
        # Return the sum of squared errors
        loss = losses.sum()
        return loss
    
    
    general_result = scipy.optimize.minimize(general_objective, coeff_0,
                                             method='Nelder-Mead',
                                             args=(data, target))
    # Test what the squared error of the returned result is
    coeff = general_result.x
    general_output = our_function(coeff, data)
    print('====================')
    print('general_result =\n%s' % (general_result,))
    print('---------------------')
    print('general_output = %r' % (general_output,))
    print('====================')
    

    The output looks like this:

    ====================
    general_result =
     final_simplex: (array([[  2.45700466e-01,   7.93719271e-09,   1.71257109e+00],
           [  2.45692680e-01,   3.31991619e-08,   1.71255150e+00],
           [  2.45726858e-01,   6.52636219e-08,   1.71263360e+00],
           [  2.45713989e-01,   8.06971686e-08,   1.71260234e+00]]), array([ 0.00012404,  0.00012404,  0.00012404,  0.00012404]))
               fun: 0.00012404137498459109
           message: 'Optimization terminated successfully.'
              nfev: 431
               nit: 240
            status: 0
           success: True
                 x: array([  2.45700466e-01,   7.93719271e-09,   1.71257109e+00])
    ---------------------
    general_output = array([ 0.00527974, -0.00561568, -0.00719941,  0.00357748])
    ====================
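
    As a variation on formulation #1, returning infinity is not the only way to keep the coefficients in range: scipy.optimize.minimize also accepts a bounds argument for methods that support it (e.g. L-BFGS-B), so the constraints can live in the optimizer call rather than in the objective. A rough sketch, using the question's bounds of A > 0, B > 0, C > 1 (the bounds are inclusive, so strictly they enforce >=):

    # (optional) variation: enforce the bounds through the optimizer itself
    def bounded_objective(coeff, data, target):
        """
        Same squared-error objective, but without the infinity check;
        the bounds are handed to the optimizer instead.
        """
        prediction = our_function(coeff, data)
        return ((prediction - target) ** 2).sum()


    # L-BFGS-B takes per-parameter (min, max) pairs; None means unbounded
    minimize_bounds = [(0, None), (0, None), (1, None)]
    bounded_result = scipy.optimize.minimize(bounded_objective, coeff_0,
                                             method='L-BFGS-B',
                                             args=(data, target),
                                             bounds=minimize_bounds)
    print('bounded_result.x = %r' % (bounded_result.x,))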
    

    I found in the documentation that all you need to do to adapt this to actual least squares is to specify the function that computes the residuals.

    # ===============================================
    # FORMULATION #2: a special least squares problem
    
    # Here all that is needed is a function that computes the vector of residuals
    # the optimization function takes care of the rest
    def least_squares_residuals(coeff, data, target):
        """
        Function that returns the vector of residuals between the predicted values
        and the target value. Here we want each predicted value to be close to zero
        """
        prediction = our_function(coeff, data)
        vector_of_residuals = (prediction - target)
        return vector_of_residuals
    
    
    # Here the bounds are specified in the optimization call
    bound_gt = np.full(shape=num_coeff, fill_value=0, dtype=float)
    bound_lt = np.full(shape=num_coeff, fill_value=np.inf, dtype=float)
    bounds = (bound_gt, bound_lt)
    
    lst_sqrs_result = scipy.optimize.least_squares(least_squares_residuals, coeff_0,
                                                   args=(data, target), bounds=bounds)
    # Test what the squared error of the returned result is
    coeff = lst_sqrs_result.x
    lst_sqrs_output = our_function(coeff, data)
    print('====================')
    print('lst_sqrs_result =\n%s' % (lst_sqrs_result,))
    print('---------------------')
    print('lst_sqrs_output = %r' % (lst_sqrs_output,))
    print('====================')
    

    The output here is:

    ====================
    lst_sqrs_result =
     active_mask: array([ 0, -1,  0])
            cost: 6.197329866927735e-05
             fun: array([ 0.00518416, -0.00564099, -0.00710112,  0.00385024])
            grad: array([ -4.61826888e-09,   3.70771396e-03,   1.26659198e-09])
             jac: array([[-0.72611025, -0.27388975,  0.13653112],
           [-0.74479565, -0.25520435,  0.1644325 ],
           [-0.35777232, -0.64222767,  0.11601263],
           [-0.77338046, -0.22661953,  0.27104366]])
         message: '`gtol` termination condition is satisfied.'
            nfev: 13
            njev: 13
      optimality: 4.6182688779976278e-09
          status: 1
         success: True
               x: array([  2.46392438e-01,   5.39025298e-17,   1.71555150e+00])
    ---------------------
    lst_sqrs_output = array([ 0.00518416, -0.00564099, -0.00710112,  0.00385024])
    ====================
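
    One last note: the bounds above only keep all three coefficients non-negative. If you want the exact constraints from the question (A > 0, B > 0, C > 1), least_squares also accepts per-parameter lower bounds, so a small tweak along these lines should work (again, the bounds are inclusive):

    # per-parameter lower bounds matching A > 0, B > 0, C > 1
    # (least_squares bounds are inclusive, so in practice this is >=)
    lower_bounds = np.array([0.0, 0.0, 1.0])
    upper_bounds = np.full(shape=num_coeff, fill_value=np.inf, dtype=float)

    constrained_result = scipy.optimize.least_squares(
        least_squares_residuals, coeff_0,
        args=(data, target), bounds=(lower_bounds, upper_bounds))
    print('constrained_result.x = %r' % (constrained_result.x,))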