Search code examples
pythonjupyter-notebookgaussianastropy

How to fit a Gaussian using Astropy


I am trying to fit a Gaussian to a set of data points using the astropy.modeling package but all I am getting is a flat line. See below:

https://i.sstatic.net/H37VC.png

https://i.sstatic.net/DOKSd.png

Here's my code:

%pylab inline
from astropy.modeling import models,fitting
from astropy import modeling

#Fitting a gaussian for the absorption lines
wavelength= linspace(galaxy1_wavelength_extracted_1.min(),galaxy1_wavelength_extracted_1.max(),200)
g_init = models.Gaussian1D(amplitude=1., mean=5000, stddev=1.)
fit_g = fitting.LevMarLSQFitter()
g = fit_g(g_init, galaxy1_wavelength_extracted_1, galaxy1_flux_extracted_1)

#Plotting 
plot(galaxy1_wavelength_extracted_1,galaxy1_flux_extracted_1,".k")
plot(wavelength, g(wavelength))
xlabel("Wavelength ($\\AA$)")
ylabel("Flux (counts)")

What am I doing wrong or missing?


Solution

  • I made some fake data that sort of resembles yours, and tried running your code on it and obtained similar results. I think the problem is that if you don't adjust your model's initial parameters to at least sort of resemble the original model, or else the fitter won't be able to converge no matter how many rounds of fitting it performs.

    If I'm fitting a Gaussian I like to give the initial model some initial parameters based on computationally "eyeballing" them like so (here I named your real data's flux and wavelength as orig_flux and orig_wavelength respectively):

    >>> an_amplitude = orig_flux.min()
    >>> an_mean = orig_wavelength[orig_flux.argmin()]
    >>> an_stddev = np.sqrt(np.sum((orig_wavelength - an_mean)**2) / (len(orig_wavelength) - 1))
    >>> print(f'mean: {an_mean}, stddev: {an_stddev}, amplitude: {an_amplitude}')
    mean: 5737.979797979798, stddev: 42.768052162734605, amplitude: 84.73925092448636
    

    where for the standard deviation I used the unbiased standard deviation estimate.

    Plotting this over my fake data shows that these are reasonable values I might have picked if I manually eyeballed the data as well:

    >>> plt.plot(orig_wavelength, orig_flux, '.k', zorder=1)
    >>> plt.scatter(an_mean, an_amplitude, color='red', s=100, zorder=2)
    >>> plt.vlines([an_mean - an_stddev, an_mean + an_stddev], orig_flux.min(), orig_flux.max(),
    ...            linestyles='dashed', colors='gg', zorder=2)
    

    enter image description here

    One feature I've wanted to add to astropy.modeling in the past is optional methods that can be attached to some models to give reasonable estimates for their parameters based on some data. So for Gaussians such a method would return much like I just computed above. I don't know if that's ever been implemented though.

    It is also worth noting that your Gaussian would be inverted (with a negative amplitude) and that it's displaced on the flux axis some 120 points, so I added a Const1D to my model to account for this, and subtracted the displacement from the amplitude:

    >>> an_disp = orig_flux.max()
    >>> g_init = (
    ...     models.Const1D(an_disp) +
    ...     models.Gaussian1D(amplitude=(an_amplitude - an_disp), mean=an_mean, stddev=an_stddev)
    ... )
    >>> fit_g = fitting.LevMarLSQFitter()
    >>> g = fit_g(g_init, orig_wavelength, orig_flux)
    

    This results in the following fit which looks much better already:

    >>> plt.plot(orig_wavelength, orig_flux, '.k')
    >>> plt.plot(orig_wavelength, g(orig_wavelength), 'r-')
    

    enter image description here

    I'm not an expert in modeling or statistics, so someone with deeper knowledge could likely improve on this. I've added a notebook with my full analysis of the problem, including how I generated my sample data here.