MATLAB: How to fit a Gauss curve in the tails of a curve

fit gaussianStatistics and Machine Learning Toolbox

I need to fit a Gauss curve to the tails of curve as shown in the picture, to calculate the area below the curve. I tried to use the code below without success. I'm using the wrong method ?

There is a better way to do it ?

I successfully did it for a Gaussian curve using the examples in the page https://www.mathworks.com/help/curvefit/fitoptions.html, shown in the picture. However, when I applied the same method to my curve it didn't work.

I tried using both, the std from the original curve and the std from a fitted Gaussian curve.

Here is my code (I attached the gdata.m file):

load('gdata.mat');
x = (1:1:129)';
% mystd = std(gdata)
gfit1= fit(x,gdata,'gauss2');
YHat = gfit1(x);
mystd = std(YHat)
gfit2= fit(x,gdata,'gauss2','Exclude', gdata > (2*mystd));

Best Answer

If the data typically have such a discontinuity at the tails, a technique we used in the old online "SulfurMeter" which was a gamma-spec device for measuring elemental S in coal for power plant emissions control was something similar to the following illustration --

First a replotting of your data for a look-see more closely...

Note the sharpness of the discontinuity is enhanced by the first derivative. Occasionally, the second will be even more helpful but here it seems the inevitable additional noise is more than the enhancement and the location appears the same anyway...so, now what? Let's try fitting a breakpoint to the lefthand section--

>> [mxL,estL]=max(d);       % left tail max location, magnitude
>> cest=[0 0 estL-5 mxL]    % estimate for fitting breakpoint by piecewise linear fit
cest =
         0           0          37        8546
>> coeffL=nlinfit(1:estL,d(1:estL),@piecewise,cest)  % do the fit
coeffL =
 1.0e+03 *
 -0.0914    0.0123    0.0381    2.2369
>> [coeffL(3) round(coeffL(3))]   % the third coefficient is the breakpoint location
ans =
 38.0916   38.0000
>>

How well did it work?

>> figure
>> plot(d(1:Lest))
>> hold on
>> plot(1:Lest,piecewise(coeffL,1:Lest),'r-')
>> legend('1st Diff','Piecewise','location','northwest')

Looks pretty good. You can do the same thing from the left except use min and work from the RH end to the front in selecting the RH locations. I haven't done it here but you'll probably want to "add one" to the returned location of the breakpoint for use with the raw data since the difference vector is one element shorter--or, you can augment/prepend the difference vector with the mean of the first few elements or similar before fitting.

NB: The initial estimates are

0,0    -- first section intercept, slope
estL-5 -- breakpoint at arbitrary point a few positions off the max
mxL    -- slope for second section

The breakpoint location estimate may need some fine-tuning on setting the distance from the located maximum; it does need to be at least "reasonably close" to the knee you're trying to locate to get a good fit.

The slope is an approximation to the slope of the second line; it would be approximately (max-mean)/(xmax-startx) if were to try to actually compute; I just assumed only a width of 1 point and zero mean so it was (max-0)/(1). As you'll note, the fitted line slope is roughly 1/4 that so dividing by the offset number may make solution a little quicker but typically the solution is not terribly sensitive to this initial guess if the location is reasonable.

Another "trick" we used in the S-meter was to do an iterative fitting of subtracting a baseline then refitting the residuals.

piecewise is my utility function for the purpose--

function y=piecewise(coef,x)
% return piecewise linear fit given a1,b1,c,b2 as coefficient array and vector x
% c is breakpoint, a1 is intercept of first section, b1,b2 are two segment slopes
a1=coef(1); b1=coef(2);  % reduce parentheses clutter....
c=coef(3); b2=coef(4);
ix=x>c;                  % location > breakpoint
y(ix)=[a1+c*(b1-b2)+b2*x(ix)];
y(~ix)=[a1+b1*x(~ix)];
y=y(:);
>>

Related Solutions

MATLAB: How to fit a curve to this plot

(continuing from the comments under the question)

To get a good fit, you need to make sure you're working with the correct function with the necessary parameters and starting off with good initial guesses for the parameter values. It looks like your data are sigmoidal so fitting the data to a sigmoid is a good start!

From Wiki (link),

e = the natural logarithm base (also known as Euler's number),
x0 = the x-value of the sigmoid's midpoint,
L = the curve's maximum value, and
k = the steepness of the curve.

You may have to add additional parameters that scale the sigmoid to your data.

I'd code that function up in matlab, plot it out, and play around with the parameters until it looks like your data. Then I'd fit your data to that function with the initial parameter guesses set to whatever you decided was similar to your data.

Curve fitting is a process and there's never really 1 solution to any set of noisy data. Play around with it and always assess the error of your fit.

MATLAB: Can I do a curve fit tool in a for loop

 The model coefficients for a Gaussian fit can be found out directly from the model. Refer to the following code on how to get the coefficients:

f = fit(x,y,'gauss2')

Also, to see how to use gaussian model to fit the data and some examples on how to use it here.

Best Answer

Related Solutions

MATLAB: How to fit a curve to this plot

MATLAB: Can I do a curve fit tool in a for loop

Related Question