Solved – Troublesome residual plot from linear mixed model

diagnostic, linear model, mixed model, regression, residuals

I have fitted the following linear mixed model based on the results of an economic game:

m1 <- lmer(TotalScore ~ perOOgivenP + Game + (1 | Subject), REML = TRUE, data = mdl1table)

TotalScore is an integer.
perOOgivenP is a proportion between 0 and 1 (most of the values are 0).
Game is numeric and indicates whether it is the 1st or 2nd game played by the participant.

The QQ plot looks good, so I am fairly confident that the residuals are approximately normally distributed. The fitted vs. residuals plot does not look as good: to me, the residuals look biased. I am not sure what is causing this or what to do about it.

Could it be caused by the number of 0s in perOOgivenP (32 of the 46 non-missing data points)? perOOgivenP is the proportion of times a particular behaviour occurred. Would anyone suggest making it binary, i.e. 0 or 1 (where 1 means the actual value is not 0)?

perOOgivenP is as follows:

 [1] 0.500000 1.000000 0.333333 0.000000 0.000000 0.000000      NaN 0.166667 0.250000 0.800000
[11] 0.166667 0.000000 0.333333 0.000000 0.000000      NaN      NaN 0.000000 0.000000 0.000000
[21]      NaN 1.000000      NaN 0.000000      NaN 0.000000 0.000000      NaN      NaN      NaN
[31]      NaN 0.000000      NaN 0.000000 0.000000 0.000000      NaN      NaN      NaN 1.000000
[41]      NaN 0.000000 0.411765 0.000000 0.000000      NaN      NaN 0.000000 0.200000 0.000000
[51]      NaN 0.000000 0.333333 0.000000 0.250000      NaN 0.000000      NaN      NaN      NaN
[61] 0.000000      NaN 0.000000 0.000000      NaN 0.000000      NaN      NaN      NaN 0.000000
[71] 0.000000 0.000000 0.000000      NaN      NaN      NaN      NaN      NaN      NaN      NaN

Fitted vs. residuals plot

Best Answer

By "biased" you presumably mean that there is some non-linear structure left in the residuals. It's by no means obvious to me that this is the case here.
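
One rough way to judge whether any non-linear structure is left is to overlay a smoother on the residuals vs. fitted plot. The sketch below uses base R's lowess(); resid_smooth is a hypothetical helper name, not something from the question, and it assumes the fitted model supports fitted() and resid(), as lmer fits do. With only 46 observations the curve will be noisy, so treat it as a visual aid rather than a test.

```r
# Hypothetical helper: residuals vs. fitted values with a lowess smoother.
# A smoother that stays close to the zero line suggests little leftover structure.
resid_smooth <- function(model) {
  f <- as.numeric(fitted(model))   # fitted values for the rows used in the fit
  r <- as.numeric(resid(model))    # corresponding residuals
  plot(f, r, xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)           # reference line at zero
  s <- lowess(f, r)                # locally weighted smoother (stats package)
  lines(s, col = "red")
  invisible(s)                     # return the smoothed curve for inspection
}

# Usage with the model from the question (assumes m1 has been fitted):
# resid_smooth(m1)
```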

Before leaping to any conclusions, and certainly before losing information by transforming a variable into a less informative form, I would look at more residual plots. An obvious one would be to give a different colour or shape to the residuals where the suspect variable (perOOgivenP) is zero and see whether those points account for the apparent pattern.
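
A minimal sketch of that plot in base R follows. resid_by_zero is a hypothetical helper name introduced here for illustration. It takes the predictor values from model.frame() rather than from the original data, because the model is fitted only on rows without missing values (the NaN rows in perOOgivenP are dropped by default), so the residuals and the predictor line up.

```r
# Hypothetical helper: residuals vs. fitted values, marked by whether `var` is zero.
resid_by_zero <- function(model, var) {
  mf <- model.frame(model)         # only the rows actually used in the fit
  d <- data.frame(
    fitted  = as.numeric(fitted(model)),
    resid   = as.numeric(resid(model)),
    is_zero = mf[[var]] == 0       # TRUE where the suspect variable is zero
  )
  plot(d$fitted, d$resid,
       col = ifelse(d$is_zero, "red", "blue"),
       pch = ifelse(d$is_zero, 1, 17),
       xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)           # reference line at zero
  legend("topright",
         legend = c(paste(var, "= 0"), paste(var, "not 0")),
         col = c("red", "blue"), pch = c(1, 17))
  invisible(d)                     # return the plotted data for inspection
}

# Usage with the model from the question (assumes m1 has been fitted):
# resid_by_zero(m1, "perOOgivenP")
```

If the two groups form separate bands, the pattern in the original plot may simply reflect the mass of zeros rather than a misspecified model. The same approach can be repeated with Game as the grouping variable.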