Tags: python-3.x, scipy, arguments, parameter-passing, minimization

How do I pass through arguments to other functions (generally and via scipy)?


I am trying to use scipy to minimize a function that returns a chi-square value, and to find the mu, sigma, and normc that give the best fit for a Gaussian overlay.

from math import exp
from math import pi
from scipy.integrate import quad
from scipy.optimize import minimize
from scipy.stats import chisquare
import numpy as np

# guess initial values for minimized chi-square
mu, sigma = np.mean(mydata), np.std(mydata) # mydata is my data points
normc = 1/(sigma * (2*pi)**(1/2)) 

gauss = lambda x: normc * exp( (-1) * (x - mu)**2 / ( 2 * (sigma **2) ) ) # Gaussian Distribution

# assume I have pre-defined bin-boundaries as a list called binbound

def expvalperbin(binbound,mu,sigma,normc):
    # calculates expectation value per bin
    ans = []
    for index in range(len(binbound)):
        if index != len(binbound)-1:
            ans.append( quad( gauss, binbound[index], binbound[index+1])[0] )
    return ans

expvalguess = expvalperbin(binbound,mu,sigma,normc)
obsval = countperbin(binbound,mydata)
arglist = [mu,sigma,normc]

def chisquareopt(obslist,explist):
    return chisquare(obslist,explist)[0]

chisquareguess = chisquareopt((obsval,expvalguess), expvalguess, args=arglist)

result = minimize( chisquareopt(obsval,expvalguess), chisquareguess   )
print(result)

Running this code provides me with this error:

TypeError: chisquareopt() got an unexpected keyword argument 'args'

I have a few questions:

1) How can I write a function to allow arguments to be passed through to my function chisquareopt?

2) How can I tell if scipy will optimize parameters [mu, sigma, normc] that give the minimum chi-square? How could I find these parameters from the optimization?

3) It is difficult to know if I'm making progress here or not. Am I on the right track?

EDIT: If it is relevant, I have a function that inputs [mu, sigma, normc] and outputs a list of sublists, each sublist containing a possible combination of [mu, sigma, normc] (where the outer list covers all possible combinations of parameters within specified ranges).


Solution

  • I've simplified your problem somewhat to give you an idea on your question 2).

    First, I've hard-coded your histogram obslist and the number of data points N as global variables (that simplifies the function signatures a little). Second, I've hard-coded the bin boundaries in expvalperbin, assuming 9 bins of fixed width 5 with the first bin starting at 30 (so the histogram ranges from 30 to 75).

    Third, I'm using optimize.fmin (Nelder-Mead) instead of optimize.minimize. Passing mu and sigma as additional parameters via args=(x,y) does not do what you want: whatever is given in args is handed to the function unchanged on every call, so those values stay fixed at what they were on the very first invocation. Only the first argument of the objective function is varied by the optimizer. You want to optimize over mu and sigma simultaneously, so both have to go into that first argument as a single vector.
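
    A minimal sketch of how args behaves, using a made-up quadratic objective (the function and the target names are illustrative, not from the question). The values in args are passed through unchanged on every call, and only the first argument is varied by the optimizer.

```python
from scipy.optimize import minimize


def objective(params, target_x, target_y):
    # params is the vector being optimized; target_x and target_y are
    # fixed extras that arrive unchanged via args on every call.
    x, y = params
    return (x - target_x) ** 2 + (y - target_y) ** 2


# Hypothetical fixed values, chosen only for illustration.
result = minimize(objective, x0=[0.0, 0.0], args=(3.0, -2.0),
                  method="Nelder-Mead")

print(result.x)    # optimized parameter vector
print(result.fun)  # objective value at that point
```

    Here result.x should end up near the two target values, while the targets themselves never change. That is why mu and sigma have to live in the first argument rather than in args.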

    Given these simplifications we have the following (surely very unpythonic) script:

    from math import exp
    from math import pi
    from scipy.integrate import quad
    from scipy.optimize import fmin
    from scipy.stats import chisquare
    
    
    obslist = [12, 51, 144, 268, 264, 166, 75, 18, 2] # histogram, 1000 observations
    N = 1000 # no. of data points
    
    
    def gauss(x, mu, sigma):
        return 1/(sigma * (2*pi)**(1/2)) * exp( (-1) * (x - mu)**2 / ( 2 * (sigma **2) ) )
    
    def expvalperbin(mu, sigma):
        e = []
        # hard-coded bin boundaries
        for i in range(30, 75, 5):
            e.append(quad(gauss, i, i + 5, args=(mu, sigma))[0] * N)
        return e
    
    def chisquareopt(args):
        # args[0] = mu
        # args[1] = sigma
        return chisquare(obslist, expvalperbin(args[0], args[1]))[0]
    
    # initial guesses
    initial_mu = 35.5
    initial_sigma = 14
    
    result = fmin(chisquareopt, [initial_mu, initial_sigma])
    
    print(result)
    

    Optimization terminated successfully.
             Current function value: 2.010966
             Iterations: 49
             Function evaluations: 95
    [ 50.57590239 7.01857529]

    By the way, the obslist histogram is a 1000-point random sample from an N(50.5, 7.0) normal distribution. Remember that these are my very first lines of Python, so please don't judge me on the style. I just wanted to give you an idea of the general structure of the problem.
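
    For comparison, the same fit can probably be written with optimize.minimize as well. The sketch below is a variant under stated assumptions, not the code that produced the output above: mu and sigma go into the parameter vector, the histogram, bin edges and N are passed through args, norm.cdf differences stand in for quad when computing the bin probabilities, and the chi-square sum is computed by hand (as far as I know, newer SciPy releases make stats.chisquare reject observed and expected totals that differ, which is the case here). The function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def expected_counts(params, edges, n_points):
    # Expected count per bin for a normal distribution with the given
    # mu and sigma: N times the probability mass between adjacent edges.
    mu, sigma = params
    cdf = norm.cdf(edges, loc=mu, scale=sigma)
    return n_points * np.diff(cdf)


def chi_square(params, observed, edges, n_points):
    # Pearson chi-square statistic, summed over all bins.
    expected = expected_counts(params, edges, n_points)
    return np.sum((observed - expected) ** 2 / expected)


# Histogram and binning taken from the answer above.
observed = np.array([12, 51, 144, 268, 264, 166, 75, 18, 2], dtype=float)
edges = np.arange(30, 80, 5)  # 30, 35, ..., 75 gives 9 bins of width 5
n_points = 1000

# x0 holds the parameters to optimize; args holds the fixed data.
result = minimize(chi_square, x0=[35.5, 14.0],
                  args=(observed, edges, n_points),
                  method="Nelder-Mead")

best_mu, best_sigma = result.x
print(best_mu, best_sigma, result.fun)
```

    result.x holds the fitted [mu, sigma] and result.fun the minimum chi-square, which is how the optimized parameters are read back (your question 2). For this histogram they should come out close to the fmin values shown earlier.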