Nonlinear Regression – How to Assess Model Fit for Non-Linear Regression

multiple regression, nonlinear regression, regression

I am looking at nonlinear regression. Below is some example output from a nonlinear regression in MATLAB. There are also two links below this output to the Minitab website; the links explain why $p$-values and $R^2$ are not valid for nonlinear regression.

So my question is: what should I look for in my results from a nonlinear regression? How can I tell whether the overall model fit is reasonable and the coefficients are significant without using $p$-values and $R^2$?

mdl = 
Nonlinear regression model:
y ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)

Estimated Coefficients:
          Estimate             SE                     tStat               pValue
    p1    1.8818508110535      0.027430139389359      68.6052223191956    2.26832562501304e-12
    p2    0.700229815076442    0.00915260662357553    76.5060538352836    9.49546284187105e-13

Number of observations: 10, Error degrees of freedom: 8
Root Mean Squared Error: 0.082
R-Squared: 0.996,  Adjusted R-Squared 0.995
F-statistic vs. zero model: 1.43e+03, p-value = 6.04e-11

Best Answer

Coefficient $p$-values and $R^2$ values aren't particularly good measures of how useful a model is, so how should you evaluate a model's usefulness? It turns out that this rabbit hole goes very deep, not least because there are different ways in which a model can be useful, and a model well-suited for one purpose may be totally inadequate for another.

One approach, as you suggested in the title, is to examine goodness of fit; that is, roughly, to measure how much variability is left in the DVs after the model explains all the variability it can in terms of variability in the IVs (the less residual variability in the DVs, the better the fit). Fit is measured by quantities such as mean square error and likelihood. On their own, these fit measures are often not particularly enlightening. How good is a log-likelihood of −20, anyway? You can put fit measures in context by trying some other models, perhaps some simpler than your model of interest and some more complex, and looking at the differences in fit. If the simpler models achieve substantially worse fit, and the increase in fit afforded by the more complex models doesn't seem worth the increase in complexity, then your model is doing a good job of describing the data—compared to the other models, anyway. This process of model comparison and model selection can be more formalized, with techniques ranging in complexity from the Akaike information criterion to fully Bayesian model selection.
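To make the model-comparison idea concrete, here is a minimal sketch in Python (assuming SciPy rather than MATLAB, and using synthetic data rather than the asker's actual dataset). It fits the model from the question alongside a simpler one-parameter competitor, then compares them with a Gaussian-error AIC; the specific model forms and starting values are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic data roughly matching the fitted coefficients in the question
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.9 * np.cos(0.7 * x) + 0.7 * np.sin(1.9 * x) + rng.normal(0, 0.1, x.size)

def full_model(x, p1, p2):
    # The model from the question: y ~ p1*cos(p2*x) + p2*sin(p1*x)
    return p1 * np.cos(p2 * x) + p2 * np.sin(p1 * x)

def simple_model(x, p1):
    # A simpler one-parameter competitor (an illustrative choice)
    return p1 * np.cos(p1 * x)

def aic(y, yhat, k):
    # AIC under Gaussian errors, up to an additive constant:
    # n * log(RSS / n) + 2k, where k is the number of parameters
    n = y.size
    rss = np.sum((y - yhat) ** 2)
    return n * np.log(rss / n) + 2 * k

p_full, _ = curve_fit(full_model, x, y, p0=[2.0, 0.5])
p_simple, _ = curve_fit(simple_model, x, y, p0=[1.0])

aic_full = aic(y, full_model(x, *p_full), k=2)
aic_simple = aic(y, simple_model(x, *p_simple), k=1)
# Lower AIC is better; a large gap suggests the extra parameter pays
# for the added complexity.
```

The absolute AIC numbers are meaningless on their own, which is exactly the point of the paragraph above: only the differences between candidate models carry information.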

Another approach focuses on prediction. The model is construed as a tool for predicting DV values given only IV values, and judged on the accuracy of these predictions. Mean square error is often a good measure of predictive accuracy. The important thing, to avoid inflating your estimates of predictive accuracy, is that you train and test the model with separate data, or use an equivalent technique such as cross-validation. Although predictive accuracy is generally more interpretable in isolation than goodness of fit, comparing your model's accuracy to that of others, particularly a trivial model of some kind, is usually helpful.
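The prediction-focused approach can be sketched the same way, again assuming SciPy and synthetic data: estimate out-of-sample MSE by $K$-fold cross-validation and compare it against a trivial predict-the-mean baseline. The helper names and fold count here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 60)
y = 1.9 * np.cos(0.7 * x) + 0.7 * np.sin(1.9 * x) + rng.normal(0, 0.1, x.size)

def model(x, p1, p2):
    # The model form from the question
    return p1 * np.cos(p2 * x) + p2 * np.sin(p1 * x)

def cv_mse(x, y, k=5):
    # K-fold cross-validation: fit on k-1 folds, score MSE on the held-out fold
    idx = rng.permutation(x.size)
    folds = np.array_split(idx, k)
    errs = []
    for test in folds:
        train = np.setdiff1d(idx, test)
        p, _ = curve_fit(model, x[train], y[train], p0=[2.0, 0.5])
        errs.append(np.mean((y[test] - model(x[test], *p)) ** 2))
    return np.mean(errs)

mse_model = cv_mse(x, y)
# Trivial baseline: always predict the overall mean of y
mse_trivial = np.mean((y - y.mean()) ** 2)
# The model earns its keep if its cross-validated MSE is well below
# the trivial baseline's MSE.
```

Because each fold's fit never sees its own test points, `mse_model` is an honest estimate of predictive accuracy, unlike the in-sample RMSE printed in the MATLAB output.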
