Tags: matlab, histogram, curve-fitting, normal-distribution, binning

Specifying bin edges when fitting a normal distribution using histfit


I want to plot a histogram of some data, with a fitted normal curve, using predefined bins. All my data points lie between 1 and 10, so I want the bins to start at xmin=1 and end at xmax=10, with a bin width of 0.5.

I use the following commands:

x = d1.data(:,4); % x is my data
H = histfit(x,10,'normal'); % plots a 10-bin histogram with a fitted normal curve

However, with the above, the bins are determined automatically for each dataset and do not correspond to the edges I want. How can I ensure that the same bin edges are used for all datasets?


Solution

  • If you have access to the Curve Fitting Toolbox, I would suggest another approach that provides the required flexibility. This involves doing the fit "yourself" instead of relying on histfit:

    % Generate some data:
    rng(66221105) % set random seed, for reproducibility
    REAL_SIG = 1.95;
    REAL_MU = 5.5;
    X = randn(200,1)*REAL_SIG + REAL_MU;
    
    % Define the bin edges you want
    EDGES = 1:0.5:10;
    
    % Bin the data according to the predefined edges:
    Y = histcounts(X, EDGES);
    
    % Fit a Gaussian to the bin counts using the Curve Fitting Toolbox:
    binCenters = conv(EDGES, [0.5, 0.5], 'valid'); % midpoints of adjacent edges (2-point moving average)
    [xData, yData] = prepareCurveData( binCenters, Y );
    
    ft = fittype( 'gauss1' );
    fitresult = fit( xData, yData, ft );
    disp(fitresult); % optional
    
    % Plot fit with data (optional)
    figure(); 
    histogram(X, EDGES); hold on; grid on;
    plot(fitresult); 
    

    This yields the following plot:

    [image: histogram of X over the predefined edges, with the fitted Gaussian curve overlaid]

    and the fitted model:

     General model Gauss1:
     fitresult(x) =  a1*exp(-((x-b1)/c1)^2)
     Coefficients (with 95% confidence bounds):
       a1 =       19.65  (17.62, 21.68)
       b1 =        5.15  (4.899, 5.401)
       c1 =       2.971  (2.595, 3.348)
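
    Note that the gauss1 coefficients are not the normal-distribution parameters directly, since gauss1 has no factor of 2 in the exponent. The comparison below is a sketch of how they relate, assuming the fitted curve approximates a normal pdf scaled to bin counts. The symbols N (number of binned points) and w (bin width, 0.5 here) are introduced for illustration and do not appear in the code above.

    ```latex
    % gauss1 model fitted above:
    f(x) = a_1 \exp\!\left[-\left(\frac{x-b_1}{c_1}\right)^{2}\right]

    % normal pdf scaled to expected bin counts:
    g(x) = \frac{N w}{\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right]

    % matching terms:
    \mu = b_1, \qquad \sigma = \frac{c_1}{\sqrt{2}}, \qquad a_1 \approx \frac{N w}{\sigma\sqrt{2\pi}}

    % with the coefficients reported above:
    \mu \approx 5.15, \qquad \sigma \approx \frac{2.971}{\sqrt{2}} \approx 2.10
    ```

    In terms of the code above, these are fitresult.b1 and fitresult.c1/sqrt(2), to be compared with REAL_MU = 5.5 and REAL_SIG = 1.95. The estimates come from a least-squares fit to the bin counts, so they will generally differ somewhat from the values histfit would report for the same data.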