Solved – How to test a nonlinear vs a linear regression model

Tags: nonlinear-regression, nonlinearity, regression-strategies, residuals

I've got a panel regression model where the Y values take a curved shape when plotted over time. A histogram of the residuals shows they are normally distributed, but a residual-vs-fitted plot shows a pattern (see image 1).

[Image 1: residual-vs-fitted plot]

When I log-transform the Y variable (with a scalar added to the zeros), the residuals are still normally distributed and the residual-vs-fitted plot shows an even more severe pattern (see image 2).

[Image 2: residual-vs-fitted plot after log transformation]

Is there an additional statistical test to determine whether a linear or nonlinear estimation technique is better for estimating this model?

Is there a better approach to dealing with the nonlinearity in my Y-variable?

Edit: thanks for the comments and thoughts!

  • The DV is a continuous variable (MW of wind capacity per state-year) — but as @JimBaldwin sussed out, the original dataset has a lot of zeros.
  • Given this, I thought about using a Poisson model, but because I've set this up as a dynamic spatial panel, I'm not sure how I would do that.
  • I've estimated the residual-vs-fitted plots using a pooled OLS regression; my main models use xsmle in Stata, which takes a maximum-likelihood approach to estimation.
  • I've also tried specifying a squared term and an inverse squared term in the model (based on the Ladder test), but these don't improve the residual-vs-fitted plots much, nor do they improve the AIC.
  • A histogram of the residuals shows they are normally distributed, but not centered on zero (see image 3).
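(For concreteness: the AIC comparison in the bullet above can be sketched as follows. This is not the Stata/xsmle workflow — just a minimal numpy illustration on synthetic, hypothetical data of how adding a squared term is judged by AIC.)

```python
import numpy as np

def ols_aic(X, y):
    """Fit OLS by least squares and return the Gaussian AIC (up to a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    # n*ln(RSS/n) + 2*(k + 1); the +1 accounts for the error variance
    return n * np.log(resid @ resid / n) + 2 * (k + 1)

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2 + 0.5 * x + 0.3 * x**2 + rng.normal(0, 1.0, x.size)  # truly quadratic DGP

X_lin = np.column_stack([np.ones_like(x), x])          # intercept + x
X_quad = np.column_stack([np.ones_like(x), x, x**2])   # intercept + x + x^2

aic_lin, aic_quad = ols_aic(X_lin, y), ols_aic(X_quad, y)
print(f"AIC linear:    {aic_lin:.1f}")
print(f"AIC quadratic: {aic_quad:.1f}")  # lower AIC -> preferred specification
```

If adding the squared term doesn't lower the AIC on your data (as in your case), that is evidence the quadratic term isn't capturing the curvature.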

Edit 2: Added the model as an image.

[Image: Dynamic spatial panel model]

[Image 3: Histogram of residuals]

Best Answer

This is perhaps more of a comment than an answer, but I am not allowed to comment. It is meant to complement the existing answer and comments.

A nonlinear regression (least squares) model is generally taken to mean a model that is nonlinear in the parameters (nonlinear in at least one parameter, anyway). As Ekaba Bisong's answer exemplifies, another possibility is a linear regression model with terms that are nonlinear in one or more independent variables but linear in the parameters. Either approach can fit a nonlinear relationship between the dependent variable and one or more independent variables. Therefore, "linear regression vs. nonlinear regression" is not quite the right framing. What you really want to ask about is a linear vs. nonlinear relationship between the dependent variable and the independent variables.
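To make the distinction concrete, here is a minimal sketch (synthetic data, hypothetical model) of the second case: a design matrix containing terms that are nonlinear in x, fit by ordinary linear least squares because the coefficients enter linearly.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 2 * np.pi, 60)
y = 1.5 + 2.0 * np.sin(x) + rng.normal(0, 0.1, x.size)

# Columns are nonlinear functions of x, but the model is linear in beta,
# so plain linear least squares applies.
X = np.column_stack([np.ones_like(x), np.sin(x), np.cos(x)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # ~ [1.5, 2.0, 0.0]
```

The fitted curve is highly nonlinear in x, yet no nonlinear optimizer was needed — which is exactly why "linear vs. nonlinear regression" is the wrong axis to think along.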

A nonlinear regression model allows more flexibility in the form of the nonlinear relationship between the dependent and independent variables than a linear regression model augmented with terms that are nonlinear in the independent variables but linear in the parameters.

One more thing to think about is the probability distribution of the errors. For instance, consider the model $y = a \exp(bx)$. It can be fit directly as a nonlinear regression model, or the logarithm of both sides can be taken and the result fit as the linear regression $\ln y = \ln a + b x$. These two models are NOT equivalent, despite frequent claims that they are (i.e., that the linear least squares version is "doing" nonlinear least squares). The log-scale version assumes additive errors in $\ln y$, which correspond to multiplicative errors in $y$; the nonlinear version assumes additive errors in $y$ itself. An error and its logarithm cannot both be normally distributed (of course, neither may be), so one model or the other may be preferable on the basis of which has errors closer to normally distributed.
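A quick sketch of the non-equivalence, using scipy's `curve_fit` for the nonlinear fit and a log-scale polynomial fit for the linear one. The data are synthetic with multiplicative (lognormal) errors, so the log-scale model's error assumption is the correct one here; the two fits give different parameter estimates.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
x = np.linspace(0.1, 5, 100)
# True model y = 2*exp(0.7*x) with multiplicative lognormal errors,
# i.e. additive normal errors on the log scale
y = 2.0 * np.exp(0.7 * x) * rng.lognormal(0.0, 0.3, x.size)

# Nonlinear least squares on the original scale (assumes additive errors in y)
(a_nls, b_nls), _ = curve_fit(lambda x, a, b: a * np.exp(b * x), x, y, p0=(1.0, 1.0))

# OLS on the log scale (assumes multiplicative errors in y)
b_ols, ln_a_ols = np.polyfit(x, np.log(y), 1)
a_ols = np.exp(ln_a_ols)

print(f"NLS:     a={a_nls:.2f}, b={b_nls:.2f}")
print(f"log-OLS: a={a_ols:.2f}, b={b_ols:.2f}")
```

The nonlinear fit is dominated by the large-$y$ points, while the log-scale fit weights all observations equally on the log scale — one visible symptom of the different error models.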