Search code examples
pythonlambdacurve-fittingleast-squares

Python lambda function with arrays as parameters


I am trying to define a function of n variables to fit to a data set. The function looks like this.

Kelly Function

I then want to find the optimal ai's and bj's to fit my data set using scipy.optimize.leastsq

Here's my code so far.

from scipy.optimize import leastsq 
import numpy as np

def kellyFunc(a, b, x): #Function to fit.
  top = 0
  bot = 0
  a = [a]
  b = [b]
  for i in range(len(a)):
    top = top + a[i]*x**(2*i)
    bot = bot + b[i]*x**(2*i)
  return(top/bot)


def fitKelly(x, y, n):
  line = lambda params, x : kellyFunc(params[0,:], params[1,:], x) #Lambda Function to minimize
  error = lambda params, x, y : line(params, x) - y #Kelly - dataset

  paramsInit = [[1 for x in range(n)] for y in range(2)] #define all ai and bi = 1 for initial guess

  paramsFin, success = leastsq(error, paramsInit, args = (x,y)) #run leastsq optimization

  #line of best fit
  xx = np.linspace(x.min(), x.max(), 100)
  yy = line(paramsFin, xx)

  return(paramsFin, xx, yy)

At the moment it's giving me the error:
"IndexError: too many indices" because of the way I've defined my initial lambda function with params[0,:] and params[1,:].


Solution

  • There are a few problems with your approach that makes me write a full answer.

    As for your specific question: leastsq doesn't really expect multidimensional arrays as parameter input. The documentation doesn't make this clear, but parameter inputs are flattened when passed to the objective function. You can verify this by using full functions instead of lambdas:

    from scipy.optimize import leastsq           
    import numpy as np
    
    def kellyFunc(a, b, x): #Function to fit.
      top = 0
      bot = 0
      for i in range(len(a)):
        top = top + a[i]*x**(2*i)
        bot = bot + b[i]*x**(2*i)
      return(top/bot)
    
    def line(params,x):
      print(repr(params)) # params is 1d!
      params = params.reshape(2,-1) # need to reshape back
      return kellyFunc(params[0,:], params[1,:], x)
    
    def error(params,x,y):
      print(repr(params)) # params is 1d!
      return line(params, x) - y # pass it on, reshape in line()
    
    def fitKelly(x, y, n):
      #paramsInit = [[1 for x in range(n)] for y in range(2)] #define all ai and bi = 1 for initial guess
      paramsInit = np.ones((n,2)) #better
      paramsFin, success = leastsq(error, paramsInit, args = (x,y)) #run leastsq optimization
    
      #line of best fit
      xx = np.linspace(x.min(), x.max(), 100)
      yy = line(paramsFin, xx)
    
      return(paramsFin, xx, yy)
    

    Now, as you see, the shape of the params array is (2*n,) instead of (2,n). By doing the re-reshape ourselves, your code (almost) works. Of course the print calls are only there to show you this fact; they are not needed for the code to run (and will produce bunch of needless output in each iteration).

    See my other changes, related to other errors: you had a=[a] and b=[b] in your kellyFunc, for no good reason. This turned the input arrays into lists containing arrays, which made the next loop do something very different from what you intended.

    Finally, the sneakiest error: you have input variables named x, y in fitKelly, then you use x and y is loop variables in a list comprehension. Please be aware that this only works as you expect it to in python 3; in python 2 the internal variables of list comprehensions actually leak outside the outer scope, overwriting your input variables named x and y.