Solved – Fraction of variance unexplained and R-squared in linear and non-linear regression

linear model, nonlinear regression, r, regression

I have a non-linear model of the following form:

$y = a x^b$

I can fit it using logarithms and a linear model or directly with a non-linear model.

First approach, logarithms and linear model:

lmfit <- lm(log(y)~log(x))

Second approach, non-linear model:

nlsfit <- nls(y~a*x^b, start=list(a=200, b=1.6))

In the first case I can simply get the $R^2$ value from the linear model or calculate it myself by:

rsquared <- var(fitted(lmfit)) / var(log(y))

In the second case no $R^2$ value is reported, but I can obtain a pseudo-$R^2$ value myself by:

pseudorsquared <- var(fitted(nlsfit)) / var(y)

In a linear model I can calculate the fraction of variance unexplained by simply doing $1-R^2$. I have read that this is not applicable to non-linear regressions. I would like to know if there is an equivalent version of this measure, so that I can compare both regressions and use the best one.
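One way to make the two fits comparable is to compute the fraction of variance unexplained (FVU) on the original scale of $y$ for both, after back-transforming the linear model's predictions. A minimal sketch in R; the simulated data (true values a = 200, b = 1.6 and the noise level) are assumptions standing in for the real measurements:

```r
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 200 * x^1.6 * exp(rnorm(50, sd = 0.1))  # multiplicative noise

lmfit  <- lm(log(y) ~ log(x))
nlsfit <- nls(y ~ a * x^b, start = list(a = 200, b = 1.6))

# Back-transform the linear fit so both predictions are on the y scale
pred_lm  <- exp(fitted(lmfit))
pred_nls <- fitted(nlsfit)

# FVU = residual sum of squares / total sum of squares
fvu <- function(obs, pred) sum((obs - pred)^2) / sum((obs - mean(obs))^2)
fvu(y, pred_lm)
fvu(y, pred_nls)
```

This puts both models on the same footing, though it quietly favours whichever error structure the data actually have, which is the answer's central point.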

As extra information, I would like to add that this is a regression of physical variables, and that the non-linear approach gives coefficients closer to literature values, whereas the linear approach gives better statistical performance ($R^2$, bias, etc.).

Best Answer

What needs greater exposure here is that the different methods make quite different assumptions about error structure. From one (perhaps conservative) viewpoint, the best method is the one that makes the most accurate assumptions about functional form and error structure, and it follows that any other method is just not as good.

You have not mentioned a third way which is quite easy to implement: use a generalized linear model for y as a function of log x, with a log link. Someone else will easily be able to give R code for that if it is not evident.
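For what it's worth, a minimal sketch of that third approach: a Gaussian GLM with a log link fits $E(y) = \exp(\beta_0 + \beta_1 \log x) = a x^b$ while keeping errors additive on the original scale of $y$. The simulated data here are an assumption for illustration:

```r
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 200 * x^1.6 + rnorm(50, sd = 50)  # additive noise suits this model

# log link gives the power-law mean; errors stay additive in y
glmfit <- glm(y ~ log(x), family = gaussian(link = "log"))

# Map the coefficients back to the power law y = a * x^b
a_hat <- exp(coef(glmfit)[["(Intercept)"]])
b_hat <- coef(glmfit)[["log(x)"]]
```

This sits between the other two fits: the mean function matches `nls`, while estimation stays within the linear-model machinery.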

Or a fourth way (which in turn can be done in different ways): to assume that both y and x are subject to error.

In terms of $R^2$ measures, my own preference here is to regard them as variations on squaring corr(observed $y$, predicted $y$), the variations coming in how $y$ is predicted. There remains a difference between any fitting procedure that can be presented as (directly equivalent to) maximizing such an $R^2$ and any where it is a descriptive figure of merit calculated post hoc.
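That view can be made concrete: compute $R^2$ as the squared correlation between observed and predicted $y$, with each method supplying its own predictions on the original scale. A sketch under the same simulated-data assumption as before; the back-transformation of the linear fit is the only subtlety:

```r
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 200 * x^1.6 * exp(rnorm(50, sd = 0.1))

lmfit  <- lm(log(y) ~ log(x))
nlsfit <- nls(y ~ a * x^b, start = list(a = 200, b = 1.6))

# Squared correlation of observed vs predicted y: one figure of merit
# that can be calculated post hoc for any fitting method
r2 <- function(obs, pred) cor(obs, pred)^2
r2(y, exp(fitted(lmfit)))
r2(y, fitted(nlsfit))
```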

But what do you really seek in such measures? It seems to me more fruitful to look at the structure of residuals using a graphical approach. On this criterion the best model leaves least structure in the residuals.
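A quick way to apply that criterion in R, again with simulated data standing in for the real measurements: plot residuals against fitted values, and look for trend or funnel shapes, which the adequate model should not leave behind.

```r
set.seed(1)
x <- seq(1, 10, length.out = 50)
y <- 200 * x^1.6 * exp(rnorm(50, sd = 0.1))
nlsfit <- nls(y ~ a * x^b, start = list(a = 200, b = 1.6))

# Residual-vs-fitted plot: the best model leaves least structure here
plot(fitted(nlsfit), residuals(nlsfit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```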

The power function appears as a favourite model in several literatures that don't overlap very much, although many of the same problems and many of the same solutions have been rediscovered in different fields. Less satisfactorily, systematic vagueness about how parameters were estimated is also common across disciplines.

Consistency with literature estimates might well depend on consistency with the dominant method in your field.
