[Math] Confidence Interval for Nonlinear Regression using F-Test – lmfit

regression, statistical-inference, statistics

I am trying to understand the implementation of the lmfit confidence interval calculation – in the docs it is stated:

"The F-test is used to compare our null model, which is the best fit we have found, with an alternate model, where one of the parameters is fixed to a specific value. The value is changed until the difference between $\chi_0^2$ and $\chi_f^2$ can’t be explained by the loss of a degree of freedom within a certain confidence.

$$
F(P_{fix}, N-P) = \bigg( \frac{\chi_f^2}{\chi_0^2} - 1 \bigg) \frac{N-P}{P_{fix}}
$$

N is the number of data points, P the number of parameters of the null model. $P_{fix}$ is the number of fixed parameters (or, to be more clear, the difference in the number of parameters between our null model and the alternate model)".

My first question is how are $\chi_0^2$ and $\chi_f^2$ calculated?

My second question is about "The value is changed until the difference .. can’t be explained by the loss of a degree of freedom within a certain confidence": I am not sure how this "can't be explained" criterion would actually be implemented in statistical terms.

I have a statistics background and am familiar with the common tests and how they are implemented, but the statements above confused me 🙂

Any explanation and/or links to concepts / examples implementing similar techniques would be great.

Thanks!

Best Answer

I think we had some interaction on Stack Overflow a few days ago regarding F-statistics. The F-test is a standard tool in statistics, not an innovation by Matthew Newville (the lmfit developer). There are basically two ways it gets used here, and both involve comparing two models, because an F statistic is simply a ratio of chi-squares from those two models: $F = \chi^2_{mod2} / \chi^2_{mod1}$ (strictly, you should use the reduced chi-squares). By convention, the best-fit model's chi-square goes in the denominator, so $F \geq 1$.
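To answer your first question directly: each $\chi^2$ is just the sum of squared, uncertainty-weighted residuals of a fit, so $\chi_0^2$ comes from the best fit with all parameters free and $\chi_f^2$ from a refit with one parameter held fixed. A minimal numpy sketch (the function and variable names here are illustrative, not lmfit's API):

```python
import numpy as np

def chi_square(data, model, sigma):
    """chi^2 = sum of squared, uncertainty-weighted residuals."""
    return np.sum(((data - model) / sigma) ** 2)

def f_statistic(chi2_0, chi2_f, n_data, n_params, n_fixed=1):
    """F from the lmfit docs: (chi2_f / chi2_0 - 1) * (N - P) / P_fix,
    where chi2_0 is the best fit (all P parameters free) and chi2_f is
    the refit with n_fixed parameters held at trial values."""
    return (chi2_f / chi2_0 - 1.0) * (n_data - n_params) / n_fixed
```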

Here's how it works in the two cases, without typing out how the F values are actually computed from a probability distribution function or any other math - just a simple thought exercise.

Case one: You have two models, with different numbers of parameters. Both fit the data well. Which one should you use? Are they statistically different from one another? How statistically significant? One sigma, two, three or some percentage?

In every model you will have some number of degrees of freedom (DOF), which depends on the number of parameters in the fit and the number of data points you are fitting. A model with more parameters will always fit better than one with fewer, so how do you justify using the extra parameter (or lack thereof) if both fit well? This can be settled by an F-test, and that is what Matthew means when he writes "the difference .. can’t be explained by the loss of a degree of freedom within a certain confidence".

The answer is to calculate the F for the two fits and compare it to a lookup table based on a probability distribution function (Bevington's Data Reduction and Error Analysis for the Physical Sciences has a decent section on F-tests). The further F is above 1, the more significant it is that the extra parameter led to a statistically better fit. The lookup-table F will depend on the degrees of freedom of your two models (again, because more parameters always give a better fit, so this has to be taken into account). If F is 1, there is obviously no difference (intuitive).
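As a sketch of that lookup step, here is the standard extra-sum-of-squares F-test for two nested models, using scipy.stats.f instead of a printed table (the function and argument names are mine, not lmfit's):

```python
from scipy.stats import f as f_dist

def f_test(chi2_simple, chi2_complex, n_data, p_simple, p_complex, alpha=0.05):
    """Extra-sum-of-squares F-test for two nested models.

    chi2_simple  : chi^2 of the model with fewer parameters (p_simple)
    chi2_complex : chi^2 of the model with more parameters (p_complex)
    """
    dfn = p_complex - p_simple        # extra parameters
    dfd = n_data - p_complex          # remaining degrees of freedom
    f_value = ((chi2_simple - chi2_complex) / dfn) / (chi2_complex / dfd)
    f_critical = f_dist.ppf(1.0 - alpha, dfn, dfd)  # the "lookup table" value
    p_value = f_dist.sf(f_value, dfn, dfd)
    return f_value, f_critical, p_value

# If f_value > f_critical (equivalently p_value < alpha), the extra parameters
# improved the fit by more than the lost degrees of freedom can explain.
```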

Case two: Calculating confidence intervals. This can be visualized by imagining an error surface. Your fit finds the minimum, which corresponds to the lowest chi^2 and a set of best-fit parameters. You now want the confidence limits on those parameters, but you can't always assume the error surface around the minimum is a symmetric parabola (the Gaussian case); the limits can be asymmetric (as per your Stack Overflow question ;)).

To calculate the interval, you move one of your parameters at a time away from its best-fit value (in both directions), re-fitting the remaining free parameters at each step. The fit will always get worse - you are moving up the error surface (larger chi^2) - and the F comparing that constrained fit to the best fit will increase. You do this iteratively, but when do you stop? You stop based on the confidence level at which you want to know your parameter. That stopping value again depends on a probability distribution function and the degrees of freedom of your two models.

So let's say you want to know the 95% confidence limit for a parameter whose best-fit value is 4, and that the 95% level corresponds to an F of 1.213 given your DOFs and the probability distribution function. You then move the parameter value away from the best fit incrementally, fixing it at each step (that's $P_{fix}$ in lmfit). Maybe fixing it at 5 worsens the fit and gives an F of 1.1 compared to the best fit, which is not at your target of 1.213 yet, so you keep going. You move it to 7.4 and find F = 1.213, matching the target, so 7.4 is your 95% confidence limit in the positive direction. You can do the same in the opposite direction from 4, but there is no guarantee the lower 95% limit will be 4 minus 3.4 (which would be symmetric) in this example.
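Here is a rough sketch of that stepping loop using lmfit itself; the toy exponential model, the residual function, the parameter names, and the fixed step size are all illustrative assumptions, and in practice lmfit's conf_interval() function automates this search far more carefully:

```python
import numpy as np
import lmfit
from scipy.stats import f as f_dist

# --- toy data and a toy decaying-exponential model (purely illustrative) ---
np.random.seed(0)
x = np.linspace(0, 10, 50)
y = 3.0 * np.exp(-x / 2.5) + np.random.normal(scale=0.1, size=x.size)

def residual(params, x, data):
    model = params['amp'].value * np.exp(-x / params['tau'].value)
    return model - data

params = lmfit.Parameters()
params.add('amp', value=1.0)
params.add('tau', value=1.0)

best = lmfit.minimize(residual, params, args=(x, y))
chi2_0, n, p = best.chisqr, best.ndata, best.nvarys

# target F for a 95% interval when one parameter is fixed (P_fix = 1)
f_target = f_dist.ppf(0.95, 1, n - p)

# step 'tau' upward from its best-fit value until F reaches the target
value = best.params['tau'].value
step = 0.01
while True:
    trial = best.params.copy()
    trial['tau'].set(value=value, vary=False)             # fix the parameter...
    refit = lmfit.minimize(residual, trial, args=(x, y))  # ...and refit the others
    f_val = (refit.chisqr / chi2_0 - 1.0) * (n - p) / 1.0
    if f_val >= f_target:
        break
    value += step

print(f"upper ~95% confidence limit for tau: {value:.3f}")
```

Stepping downward instead (value -= step) gives the lower limit, which in general will not sit symmetrically about the best-fit value.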

Hope this helps
