Solved – compare models from linear regression and nonlinear regression using RMSE

I am comparing multiple published equation forms, refit with independent data. I'm trying to be true to the original authors' methods as much as possible. Therefore, I have 3 linear equations (fit in R using lm()), two of which use transformed Y-variables, and one equation fit using nonlinear regression (fit in R using the gnls() function).

In all cases I'm weighting the residual variance structure using the inverse of one of the predictors to account for observed heteroskedasticity.

I have been evaluating the models using R² and RMSE, computed on back-transformed data for the two models with transformations.

I've calculated RMSE "by hand" using the following equation:

 RMSE <- sqrt(sum(residuals(Equation)^2) / (length(residuals(Equation)) - 2))

Should I use similar code to calculate RMSE for the linear and nonlinear regression models? Is the metric still a valid statistic for comparison, or am I missing some important assumption?

Edited: I initially stated that I was also comparing models using AIC; I later recalled that AIC would not be appropriate if the Y-variables were transformed because the models would be estimating different things.

Best Answer

RMSE is certainly appropriate for nonlinear models as well.
However, the RMSE expressions I know take the plain mean of the squared residuals, so there is no -2 in the denominator. (The -2 looks like the residual degrees of freedom of a linear model; the degrees of freedom for the nonlinear models would be different!)
In general, I would not calculate RMSE from the residuals of the fitted data but from independent test cases, to avoid an optimistic bias.
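
As a rough illustration of both points, here is a minimal sketch in R. Everything in it is an assumption made for the example: the simulated data, the names dat, train, test, w and rmse, the log-log model form, and the use of nls() as a simpler stand-in for gnls(). It computes RMSE as a plain mean, on the original Y scale, from held-out test cases rather than from the residuals of the fit.

```r
# Hypothetical simulated data; none of these names come from the question.
set.seed(1)
n <- 200
x <- runif(n, 1, 10)
y <- 2 * x^1.5 * exp(rnorm(n, sd = 0.2))
dat <- data.frame(x = x, y = y)
dat$w <- 1 / dat$x  # weights: inverse of a predictor, as in the question

# Hold out a quarter of the rows as independent test cases.
test_idx <- sample(n, n / 4)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# RMSE as the square root of the plain mean squared error (no "- 2").
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

# Linear model on a log-transformed response.
fit_lm <- lm(log(y) ~ log(x), data = train, weights = w)

# Nonlinear model; nls() is used here instead of gnls() to stay in base R.
# Starting values are taken from the log-log fit.
fit_nl <- nls(y ~ a * x^b, data = train, weights = w,
              start = list(a = exp(coef(fit_lm)[[1]]), b = coef(fit_lm)[[2]]))

# Predict the test cases; back-transform the log-scale predictions first
# so that both RMSE values are on the original Y scale.
pred_lm <- exp(predict(fit_lm, newdata = test))
pred_nl <- predict(fit_nl, newdata = test)

rmse_lm <- rmse(test$y, pred_lm)
rmse_nl <- rmse(test$y, pred_nl)
print(c(lm = rmse_lm, nls = rmse_nl))
```

Because the same rmse() helper is applied to observed and predicted values on the same scale, it does not matter whether the predictions came from lm(), nls() or gnls(). A single random split is only the simplest option; repeated splits or cross-validation would give a more stable estimate.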
