Search code examples
pythonbilinear-interpolation

Bilinear Fit on data using python


I'm trying to fit a number depending on two inputs via a bilinear fit using python.

I have random sets of data of x and y inputs and the corresponding z output.

I need the output coefficients a, b, c, and d for

z(x,y) = a*x + b*y + c*x*y + d

The input data I have is not a quadratic "grid" It might have more elements in the x direction than in the y space. Furthermore, there might be values missing. So basically, I have a random set of x and y pairs that I need to find bilinear cofficients for to match the given data as good as possible.

I already tried:

  • from sklearn import linear_model: I realized too late, that this does not allow a bilinear fit.
  • from scipy.interpolate import LinearNDInterpolator: This does work, but somehow does not allow using the full input range of parameters and I found no way of actually getting any coefficients out of it.

I could calculate the bilinear coefficients based on 4 points manually, but this wouldn't give me the best result.

Any ideas how to approach this problem? I'm completely stuck. Searching for "bilinear fit python" did not turn up much useful information.

As requested. A little sample data:

x       y   z
1056    8   50.89124679
1056    16  61.62827273
1056    32  78.83079982
1056    48  92.90073197
1056    64  105.103744
1056    80  116.0303753
1056    96  126.0130906
1056    112 135.2610439
1056    128 143.9159512
1056    144 152.0790946
2048    8   63.71675604
2048    16  77.15971099
2048    32  98.69757849
2048    48  116.313387
2048    64  131.5917779
2048    80  145.2721136
2048    96  157.7706532
2048    112 169.3492575
2048    128 180.1853546
2048    144 190.4057615
4096    8   86.7357654
4096    16  105.0352703
4096    32  134.3541477
4096    48  158.334033
4096    64  179.1320602
4096    80  197.7547066
4096    96  214.7686034
4096    112 230.5302193
4096    128 245.2810877
4096    144 259.193829

Solution

  • Least squares fit. (Sorry: home-grown, distinctly Fortran-like solution! I expect there's a better library function somewhere in scipy.):

    Form S = sum( ( ax+by+cxy+d - z )^2 )

    Set dS/da = dS/db = dS/dc = dS/dd = 0 and solve the resulting matrix equation for [a,b,c,d].

    You can uncomment the test values to see if it works. It's far from a perfect fit to your data.

    a =  0.012245406171250679
    b =  0.5585392222297886
    c =  0.0001676173599609264
    d =  40.804478822630024
            x          y          z        fit      error  
     1056.000      8.000     50.891     59.620      8.729  
     1056.000     16.000     61.628     65.504      3.876  
     1056.000     32.000     78.831     77.273     -1.558  
     1056.000     48.000     92.901     89.042     -3.859  
     1056.000     64.000    105.104    100.810     -4.293  
     1056.000     80.000    116.030    112.579     -3.451  
     1056.000     96.000    126.013    124.348     -1.665  
     1056.000    112.000    135.261    136.116      0.855  
     1056.000    128.000    143.916    147.885      3.969  
     1056.000    144.000    152.079    159.654      7.575  
     2048.000      8.000     63.717     73.098      9.381  
     2048.000     16.000     77.160     80.312      3.152  
     2048.000     32.000     98.698     94.741     -3.956  
     2048.000     48.000    116.313    109.170     -7.143  
     2048.000     64.000    131.592    123.600     -7.992  
     2048.000     80.000    145.272    138.029     -7.243  
     2048.000     96.000    157.771    152.458     -5.313  
     2048.000    112.000    169.349    166.887     -2.462  
     2048.000    128.000    180.185    181.316      1.131  
     2048.000    144.000    190.406    195.745      5.339  
     4096.000      8.000     86.736    100.922     14.187  
     4096.000     16.000    105.035    110.883      5.848  
     4096.000     32.000    134.354    130.805     -3.549  
     4096.000     48.000    158.334    150.726     -7.608  
     4096.000     64.000    179.132    170.648     -8.484  
     4096.000     80.000    197.755    190.570     -7.185  
     4096.000     96.000    214.769    210.491     -4.277  
     4096.000    112.000    230.530    230.413     -0.117  
     4096.000    128.000    245.281    250.334      5.053  
     4096.000    144.000    259.194    270.256     11.062 
    

    Code:

    import numpy as np
    
    def bilinear_fit( D ):
        N = len( D )
        
        Sx, Sxx, Sy, Syy, Sxy, Sxxy, Sxyy, Sxxyy = 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0
        Sz, Sxz, Syz, Sxyz = 0.0, 0.0, 0.0, 0.0
    
        for i in range( N ):
            x, y, z = D[i]
            Sx    += x
            Sxx   += x * x
            Sy    += y
            Syy   += y * y
            Sxy   += x * y
            Sxxy  += x * x * y
            Sxyy  += x * y * y
            Sxxyy += x * x * y * y
            Sz    += z
            Sxz   += x * z
            Syz   += y * z
            Sxyz  += x * y * z
        
        A = np.array( [ [ Sxx , Sxy , Sxxy , Sx  ],
                        [ Sxy , Syy , Sxyy , Sy  ],
                        [ Sxxy, Sxyy, Sxxyy, Sxy ],
                        [ Sx  , Sy  , Sxy  , N   ] ] )
        RHS = np.array( [ Sxz, Syz, Sxyz, Sz ] )
    
        return np.linalg.solve( A, RHS )             # returns [ a, b, c, d ]
        
    
    
    
    # Given data
    D = [ 
          [ 1056,   8   ,  50.89124679 ],
          [ 1056,   16  ,  61.62827273 ],
          [ 1056,   32  ,  78.83079982 ],
          [ 1056,   48  ,  92.90073197 ],
          [ 1056,   64  ,  105.103744  ],
          [ 1056,   80  ,  116.0303753 ],
          [ 1056,   96  ,  126.0130906 ],
          [ 1056,   112 ,  135.2610439 ],
          [ 1056,   128 ,  143.9159512 ],
          [ 1056,   144 ,  152.0790946 ],
          [ 2048,   8   ,  63.71675604 ],
          [ 2048,   16  ,  77.15971099 ],
          [ 2048,   32  ,  98.69757849 ],
          [ 2048,   48  ,  116.313387  ],
          [ 2048,   64  ,  131.5917779 ],
          [ 2048,   80  ,  145.2721136 ],
          [ 2048,   96  ,  157.7706532 ],
          [ 2048,   112 ,  169.3492575 ],
          [ 2048,   128 ,  180.1853546 ],
          [ 2048,   144 ,  190.4057615 ],
          [ 4096,   8   ,  86.7357654  ],
          [ 4096,   16  ,  105.0352703 ],
          [ 4096,   32  ,  134.3541477 ],
          [ 4096,   48  ,  158.334033  ],
          [ 4096,   64  ,  179.1320602 ],
          [ 4096,   80  ,  197.7547066 ],
          [ 4096,   96  ,  214.7686034 ],
          [ 4096,   112 ,  230.5302193 ],
          [ 4096,   128 ,  245.2810877 ],
          [ 4096,   144 ,  259.193829  ]   ]
    
    
    # Alternative test data
    # for i in range( len( D ) ):
    #     D[i][2] = 2 * D[i][0] + 3 * D[i][1] + 4 * D[i][0] * D[i][1] + 5
    
    
    a, b, c, d = bilinear_fit( D )
    print( "a = ", a )
    print( "b = ", b )
    print( "c = ", c )
    print( "d = ", d )
    
    fmthead = 5 * "{:>9}  "
    fmtdata = 5 * "{:9.3f}  "
    print( fmthead.format( "x", "y", "z", "fit", "error" ) )
    for i in range( len( D ) ):
        x, y, z = D[i]
        prediction = a * x + b * y + c * x * y + d
        error = prediction - z
        print( fmtdata.format( x, y, z, prediction, error ) )