MATLAB: Curve fitting using the Curve Fitting Toolbox gives wrong confidence intervals

confidence intervals, curve fitting, Curve Fitting Toolbox, fit, statistics, toolbox

We have a set of data that looks something like this:

We are trying to use the Curve Fitting Toolbox to fit a polynomial curve and plot that curve together with its confidence interval. Sounds simple, right?

The MATLAB "fit" function returns a cfit object. The object has the following output:

fitresult =
      Linear model Poly2:
    fitresult(x) = p1*x^2 + p2*x + p3
    Coefficients (with 95% confidence bounds):
      p1 =    -0.01183  (-0.01667, -0.006986)
      p2 =       1.337  (1.088, 1.585)
      p3 =      -45.94  (-48.62, -43.25)

I'm seeing a problem there: the interval for p3 is a bit small, and when you plot it you can see how weird it looks:

It looks "backwards" and I'm not sure why. Does anyone know why?

Best Answer

Without your data, we cannot verify what you have done, or verify the coefficients, or verify the uncertainties produced around those estimates. Without it, I cannot show you the difference between various kinds of confidence limit curves. So hopefully, I can explain what you are seeing.

So in that vacuum, I would point out that you are thinking about how those limits would look on a simple first order model, and confusing things with a quadratic model. The limits shown on the plot are NOT 95% limits around the data, or anything like that. They are limits on the predictions of the model at each point. Why is this important? Again, it is a QUADRATIC model.

Think about it in terms of the quadratic term. At the upper end of your data where distance is in the neighborhood of 40-50, what happens with the quadratic term?

p1 =    -0.01183  (-0.01667, -0.006986)

So p1 varies CONSIDERABLY. We don't know the value of p1 at all well. So depending on the value of p1, those distances at 50 get SQUARED. They contribute hugely to the predicted value, because p1 itself has a relatively large variance. How does p1 enter into the model? It is the term:

p1*x^2

Again, the lines that you see drawn are lines around the PREDICTED value of the curve, NOT lines around the data.
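Since the original data were not posted, the difference between the two kinds of bounds can only be illustrated on made-up data. The following is a minimal sketch, assuming the Curve Fitting Toolbox is installed. The synthetic x and y, the noise level, and all variable names are assumptions for illustration, not the asker's data.

```matlab
% Synthetic data (an assumption; the asker's data were not posted)
rng(0);                                   % reproducible noise
x = linspace(0, 50, 40)';
y = -0.012*x.^2 + 1.3*x - 46 + 2*randn(size(x));

fitresult = fit(x, y, 'poly2');           % quadratic fit, returns a cfit object

xq = linspace(0, 50, 200)';               % points at which to evaluate the bounds

% 95% bounds on the fitted curve itself (uncertainty of the model prediction)
bFun = predint(fitresult, xq, 0.95, 'functional', 'off');

% 95% bounds on where a new observation would be expected to fall
bObs = predint(fitresult, xq, 0.95, 'observation', 'off');

plot(x, y, 'k.', xq, fitresult(xq), 'b-', xq, bFun, 'r--', xq, bObs, 'g:');
xlabel('x'); ylabel('y');
```

The 'functional' band is always the narrower of the two, because the 'observation' band also includes the scatter of the data around the curve. How the width of either band changes along x depends on where the data points sit.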

Down near zero, what happens? We don't really care what the value of p1 is down there. Who gives a hoot, because x is near ZERO. Square zero, and who cares what p1 is?

Think of it like this: at x = 50, x^2 is 2500. So the contribution of the quadratic term to the prediction is

2500*[-0.01667, -0.006986]
ans =
   -41.675000000000004 -17.465000000000000

At x=50, the uncertainty is huge around the prediction due to the quadratic term. At x=0, the quadratic term is irrelevant.
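The same arithmetic can be repeated at a few x values to see how quickly the range grows. This is only a rough sketch: it looks at p1 in isolation, using the bounds printed in the fit output, whereas the bounds drawn on the plot also account for p2, p3 and the correlations between all three coefficients. The sample x positions are chosen purely for illustration.

```matlab
% Confidence bounds on p1, copied from the fit output
p1Bounds = [-0.01667, -0.006986];

x = [0 10 25 50];                        % sample x positions (illustrative choice)
contrib = (x.^2)' * p1Bounds;            % each row: [low, high] value of p1*x^2
spread  = contrib(:,2) - contrib(:,1);   % width of that range at each x

disp([x' contrib spread]);               % columns: x, low, high, spread
```

The spread is zero at x = 0 and grows with x^2, which is the effect described above.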

So your problem is in not understanding what those confidence lines mean, and how they are computed. They are NOT confidence limits around the data in any form. And they are not quite the same as what you would expect for a simple first order linear model, so I think you may be letting your preconceived notions sway your thinking.

Related Solutions

MATLAB: Surface fitting issue when exporting polynomial

Oh, too bad! You are the 1000001'th person to make this mistake. The person before you got an all expense paid trip to Newark, New Jersey. Ok, most of their expenses were paid. Maybe some of them were paid? :)

Seriously, this is a very common mistake that people make with polynomials, or with curve fitting in general.

You see those coefficients written out to 4 digits, so you try to use them. Bad idea. A really bad idea. For example, the p50 coefficient is 0.1246. Or is it? Not exactly. In fact, there are some more digits past the 4th decimal place that are important.

When you try to evaluate that function at x = 4.8, remember that you are forming terms like this:

(0.1246 + delta)*4.8^5

for some small error delta, on the order of +/- 0.0001. Since

4.8^5
ans =
         2548

is moderately big, multiplying it by the error you make by representing the coefficient as exactly 0.1246, instead of perhaps 0.1245785675446642 (or whatever it truly was), introduces a serious amount of error into your prediction.
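To put a number on that, here is a small sketch using the hypothetical full-precision value quoted above. That value is only a stand-in ("or whatever it truly was"), so the result shows the size of the effect, not the actual error in the original fit.

```matlab
% 0.1245785675446642 is a made-up stand-in for the true coefficient
cFull    = 0.1245785675446642;
cRounded = 0.1246;                      % what the command window shows to 4 digits

x = 4.8;
termErr = (cRounded - cFull) * x^5;     % error in this single term of the polynomial

fprintf('x^5 = %.5f, error in the p50 term = %.4f\n', x^5, termErr);
```

That is an error of about 0.05 from one term alone.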

Now, do that for EVERY coefficient in that model. What happens is instead of a valid prediction, you get random garbage.

You need to use the EXACTLY estimated coefficients in the polynomial model. Not just some 4 digit approximation as it was written in the command window.
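One way to do that, sketched here under the assumption that the fit object is still in the workspace, is either to evaluate the object directly or to pull the coefficients out with coeffvalues. The data and the 'poly22' model below are made up to keep the example small; they are not the model from the question.

```matlab
% Synthetic surface data (an assumption; the original data are not shown)
rng(1);
x = 5*rand(200,1);
y = 5*rand(200,1);
z = 0.3 + 0.1*x - 0.05*y.^2 + 0.01*randn(200,1);

sf = fit([x y], z, 'poly22');     % sfit object; low order chosen only for brevity

% Option 1: evaluate the fit object itself, so no digits are lost
zhat = sf(4.8, 2.0);

% Option 2: extract the coefficients at full double precision
c     = coeffvalues(sf);          % numeric row vector of coefficients
names = coeffnames(sf);           % cell array of names such as 'p00', 'p10', ...
for k = 1:numel(c)
    fprintf('%s = %.17g\n', names{k}, c(k));
end
```

Printing with %.17g shows enough digits to reproduce each double exactly, which the default 4 or 5 digit display does not.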

Next, I would point out that your predictions with this model are probably worth very little. Look at the predicted uncertainties in those coefficients. For example, consider the constant term.

p00 =        -375  (-842, 91.94)

Yep, -375, plus or minus something well over 400. So in fact, that constant term could easily have been zero. Or it might have been -800. Your z values were all between 0 and 1. In fact, just the error that you make by representing p00 as exactly -375, instead of some floating point number that is close to -375, is as large as the entire variation in your data.

This is just one of the problems with using high order polynomials in curve and surface fitting. And this is a moderately high order polynomial. Most of those coefficients are barely significantly different from zero, if at all. What that means is even if the model does predict your data reasonably well, it has very little meaning, and it will certainly do poorly in trying to extrapolate beyond the data.
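A quick way to check which coefficients are indistinguishable from zero is to test whether each confidence interval contains zero. The sketch below uses only the p00 bounds quoted above, arranged in the 2-row layout that confint returns for a fit object (row 1 lower bounds, row 2 upper bounds); the variable names are illustrative.

```matlab
% Bounds for p00 copied from the output above, laid out like confint's result:
% row 1 = lower bound, row 2 = upper bound, one column per coefficient
ci = [-842; 91.94];

containsZero = ci(1,:) < 0 & ci(2,:) > 0;   % true where zero lies inside the interval
halfWidth    = (ci(2,:) - ci(1,:)) / 2;     % roughly the "plus or minus" size

fprintf('contains zero: %d, half-width: %.2f\n', containsZero, halfWidth);
```

With a real fit object, ci = confint(sf) would give one column per coefficient, and the same two lines would flag every coefficient whose interval straddles zero.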

MATLAB: How can I find the angle with the x-axis of a plot in MATLAB

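% Coefficients of the fitted quadratic
p1 = 0.00053537;
p2 = -8.1393e-018;
p3 = -0.00044699;
x = 0:0.1:10;
y = p1*x.^2 + p2*x + p3;
% Polar angle (radians) of each point (x, y), measured from the x-axis at the origin
angles = angle(x + 1i*y);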
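If what is wanted is instead the angle that the curve itself makes with the x-axis at each point (the tangent angle), a possible alternative is to take the arctangent of the derivative. This is a sketch under that different reading of the question, not part of the original answer.

```matlab
p1 = 0.00053537;
p2 = -8.1393e-018;
p3 = -0.00044699;              % constant term; it does not affect the slope
x  = 0:0.1:10;

slope    = 2*p1*x + p2;        % dy/dx of p1*x.^2 + p2*x + p3
thetaDeg = atand(slope);       % tangent angle with the x-axis, in degrees
```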
