Solved – Interpretation of residual plot for mixed effect linear model

linearity, lme4-nlme, r, residuals

I am trying to fit a mixed-effects regression model with the lmer function from lme4. The model has two categorical fixed effects, one categorical random effect, and a continuous response (distance in meters). I am puzzled by this residual plot.

[Image: plot of residuals against fitted values]

I would like to ask:

1.) May I assume from this plot that the model is linear?

2.) Is there heteroskedasticity, given that the variability of the residuals is lower for smaller values than for larger ones?

3.) Given points 1 and 2, can I use this method, or should I log-transform the response and check whether the plot looks better, or should I use some other type of model (which one)?

Best Answer

The relationship looks linear, because the mean of the residuals seems to be close to 0 at each level of the predicted values. If you want a more rigorous check, one option is to add powers of the fitted values as predictors and test whether their coefficients differ from zero (the Ramsey RESET test).
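A rough sketch of that check for a mixed model is given here. It is a RESET-style variant rather than the textbook test for lm fits: the data are simulated, the names d, f1, f2, g and distance are placeholders rather than the asker's variables, and the fitted values are standardized before taking powers only to keep the predictors on a comparable scale.

```r
library(lme4)

set.seed(1)
# Hypothetical data (not the asker's): two fixed factors, one grouping factor
d <- expand.grid(f1 = factor(c("a", "b")),
                 f2 = factor(c("x", "y", "z")),
                 g = factor(1:12),
                 rep = 1:4)
g_eff <- rnorm(12, sd = 2)
d$distance <- 20 + 3 * (d$f1 == "b") + 2 * as.integer(d$f2) +
  g_eff[as.integer(d$g)] + rnorm(nrow(d), sd = 1.5)

# ML fit (REML = FALSE) so models with different fixed effects are comparable
m0 <- lmer(distance ~ f1 + f2 + (1 | g), data = d, REML = FALSE)

# Standardized fitted values, then their square and cube as extra predictors
z <- as.numeric(scale(fitted(m0)))
d$z2 <- z^2
d$z3 <- z^3
m1 <- lmer(distance ~ f1 + f2 + z2 + z3 + (1 | g), data = d, REML = FALSE)

# Likelihood-ratio test: a small p-value points to unmodelled nonlinearity
reset_check <- anova(m0, m1)
print(reset_check)
```

With only categorical predictors, such a check mostly probes whether the additive structure of the fixed effects is adequate, so an interaction term may be the natural remedy if it flags a problem.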

However, you are right about the heteroscedasticity: the variance of the residuals increases with the fitted values, which is a problem. The distribution also seems slightly asymmetrical, with a somewhat heavier tail for positive residuals. A better way to check the asymmetry visually is a normal quantile-quantile plot (function qqnorm in R).
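As a minimal sketch of both diagnostics, the block below draws the residuals-versus-fitted plot next to the normal Q-Q plot. The data are simulated with multiplicative noise so that the fan shape appears by construction; the names d, f1, f2, g and distance are assumptions, not the asker's variables.

```r
library(lme4)

set.seed(2)
# Hypothetical data with multiplicative noise, so spread grows with the mean
d <- expand.grid(f1 = factor(c("a", "b")),
                 f2 = factor(c("x", "y", "z")),
                 g = factor(1:12),
                 rep = 1:4)
g_eff <- rnorm(12, sd = 0.4)
d$distance <- exp(3 + 0.4 * (d$f1 == "b") + 0.3 * as.integer(d$f2) +
                  g_eff[as.integer(d$g)] + rnorm(nrow(d), sd = 0.3))

m <- lmer(distance ~ f1 + f2 + (1 | g), data = d)
res <- resid(m)

op <- par(mfrow = c(1, 2))
# Left panel: a funnel opening to the right indicates heteroscedasticity
plot(fitted(m), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
# Right panel: upper points rising above the line indicate a heavy right tail
qq <- qqnorm(res)
qqline(res)
par(op)
```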

Given the heteroscedasticity and asymmetry of the residuals, I would suggest transforming the response. You could try taking the log or the reciprocal of the dependent variable, refit the model, and check the residuals again. Alternatively, you could use the Box-Cox transformation, which often works well because the power is estimated from the data. You can use the boxcox function of the MASS package to find the optimal value of lambda for your data. (Have a look also at this tutorial on R-bloggers.)
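A hedged sketch of this workflow follows. MASS::boxcox works on lm fits and formulas, not on lmer objects, so the sketch estimates lambda from an ordinary lm in which the grouping factor is entered as a fixed effect. That is an approximation chosen here for illustration, and the simulated data and the names d, f1, f2, g and distance are again placeholders.

```r
library(lme4)
library(MASS)

set.seed(3)
# Hypothetical positive, right-skewed response (Box-Cox requires y > 0)
d <- expand.grid(f1 = factor(c("a", "b")),
                 f2 = factor(c("x", "y", "z")),
                 g = factor(1:12),
                 rep = 1:4)
g_eff <- rnorm(12, sd = 0.4)
d$distance <- exp(3 + 0.4 * (d$f1 == "b") + 0.3 * as.integer(d$f2) +
                  g_eff[as.integer(d$g)] + rnorm(nrow(d), sd = 0.3))

# Approximation: grouping factor as a fixed effect so that boxcox() can be used
lm_fit <- lm(distance ~ f1 + f2 + g, data = d, y = TRUE)
bc <- boxcox(lm_fit, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]  # lambda with the highest profile log-likelihood
print(lambda_hat)

# lambda near 0 suggests the log, near -1 the reciprocal; here we refit with log
m_log <- lmer(log(distance) ~ f1 + f2 + (1 | g), data = d)
plot(fitted(m_log), resid(m_log), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

After the refit, the residual plot and Q-Q plot should be inspected again; the transformation is only justified if the funnel shape and the skew are visibly reduced.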