Solved – Tool to confirm Gaussian fit

curve fittingleast squares

I have a series of (x1,y1) points. I'm using a 3rd party software tool to which I feed these points. The tool then provides a mechanism for me to get back a series of (x2,y2) points that are on a Gaussian curve that has been fit to the data.

I'd like to confirm that the points I'm getting back are correct (because, frankly, sometimes they don't look like they are.)

What would be an easy on-line or Excel procedure that would allow me to enter the X1, Y1 series and then get back a series of X2, Y2 points, plot both series, and maybe even see the s.d.?

It's been 20+ years since I did any stats, so I'm just "cook-booking" this with no understanding of what I'm trying to do.

EDIT

The is a plot by wavelength, of intensity. X-axis is the colors of a rainbow, in the familiar ROYGBIV order. The Y-axis is measured intensity of that particular color.

The help page from the vendor is here: http://www.lohninger.com/helpcsuite/calcgaussfit.htm

I call the library with each x1, y1 data point. I then call the method above. I then call the library again with a series of x values (or maybe the x1 series itself), retrieving the y2 values on the fitted curve. I also retrieve the sd of the fit.

EDIT 2

Below is a link to an image of the dataset and Gaussian fit that has me concerned. I think that the peak of the fitted curve is too low.

GraphImage](http://www.flickr.com/photos/10505025@N08/)

Best Answer

Your fitting method uses least squares. To check it, set up four parallel columns in the spreadsheet:

  • X has the x-values.
  • Y has the y-values.
  • Fit computes the Gaussian values (based on the x-values and three parameters).
  • Residual is the difference between the y-values and the fits.

In order to compute the fit, you need to create three cells holding the three gaussian parameters. The formula for the fit must be identical to that used by the other software so you can compare your results with its. The example below names the three parameters kappa0, kappa1, and kappa2--just as in the documentation. The formula in the second row, where the x-value is in cell A2, is

=Kappa0 * EXP(-1*(A2 - Kappa1)^2 / Kappa2)

It is copied down to all the other rows.

This is enough to check the software's results simply by plugging in its reported values of kappa0, kappa1, and kappa2. To see whether they are correct, compute the sum of squared residuals (SSR). A formula for this uses the SUMSQ function ("=SUMSQ(D2:D32)" in the example). If you think a better combination of the parameters will work, plug in that new combination and see whether SSR decreases: if it goes down, the new values are better.

Spreadsheet

You can have this automated for you using Excel's "Solver" tool. Specify that you want to minimize the SSR by varying the three kappas. Start with the solution given by the other software. Solver will try systematically to improve its solution.

Solver dialog

The same method--suitably adapted--works well for least squares, maximum likelihood, and other optimization procedures in statistics, provided the objective function is well-behaved (i.e., differentiable and convex) and you can obtain an excellent starting value. Otherwise, Solver is perfectly capable of reporting inferior solutions or failing altogether: it is wise not to use it as the sole method to solve a problem.

Related Question