I have fitted the following linear mixed model based on the results of an economic game:
m1 <- lmer(TotalScore ~ perOOgivenP + Game + (1 | Subject), REML = TRUE, data = mdl1table)
TotalScore is an integer.
perOOgivenP is a proportion between 0 and 1 (most of the values are 0).
Game is numeric and indicates whether it is the 1st or 2nd game played by the participant.
The QQ plot looks good, so I am confident the residuals are normally distributed. The fitted vs. residuals plot does not look so good: to me it looks like the residuals are biased. I am not sure what is causing this or what to do about it.
Could it be caused by the number of zeros in perOOgivenP (32 of 46 data points)? perOOgivenP is the proportion of times a particular behaviour occurred. Would anyone suggest making this binary, i.e. 0 versus 1 (any value that is not 0)?
perOOgivenP is as follows:
[1] 0.500000 1.000000 0.333333 0.000000 0.000000 0.000000 NaN 0.166667 0.250000 0.800000
[11] 0.166667 0.000000 0.333333 0.000000 0.000000 NaN NaN 0.000000 0.000000 0.000000
[21] NaN 1.000000 NaN 0.000000 NaN 0.000000 0.000000 NaN NaN NaN
[31] NaN 0.000000 NaN 0.000000 0.000000 0.000000 NaN NaN NaN 1.000000
[41] NaN 0.000000 0.411765 0.000000 0.000000 NaN NaN 0.000000 0.200000 0.000000
[51] NaN 0.000000 0.333333 0.000000 0.250000 NaN 0.000000 NaN NaN NaN
[61] 0.000000 NaN 0.000000 0.000000 NaN 0.000000 NaN NaN NaN 0.000000
[71] 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN NaN NaN
Best Answer
By "biased" you presumably mean that there is some non-linear structure left in the residuals. It's by no means obvious to me that this is the case here.
Before leaping to any conclusions - and certainly before losing any information by transforming variables into a less-informative state - I would look at more residual plots. An obvious one would be to give a different colour or shape to the residuals where the suspect variable has a value of zero and see whether those points account for the apparent pattern.
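As a rough illustration of that kind of plot, here is a minimal sketch in R. The data frame `sim` and the model `m_sim` are simulated stand-ins invented for this example (they are not the real `mdl1table` or `m1`), and the colours and plotting symbols are arbitrary choices.

```r
library(lme4)

# Simulated stand-in data (NOT the real mdl1table): 23 hypothetical
# subjects, 2 games each, with a predictor that is mostly zero.
set.seed(1)
n_subj <- 23
sim <- data.frame(
  Subject = factor(rep(seq_len(n_subj), each = 2)),
  Game    = rep(1:2, times = n_subj)
)
sim$perOOgivenP <- ifelse(runif(nrow(sim)) < 0.7, 0, runif(nrow(sim)))
subj_eff <- rnorm(n_subj, sd = 2)
sim$TotalScore <- round(20 + 5 * sim$perOOgivenP + 1.5 * sim$Game +
                        subj_eff[as.integer(sim$Subject)] +
                        rnorm(nrow(sim), sd = 2))

# Same model structure as in the question, fitted to the simulated data.
m_sim <- lmer(TotalScore ~ perOOgivenP + Game + (1 | Subject),
              REML = TRUE, data = sim)

# Flag observations where the suspect predictor is exactly zero.
# model.frame() returns only the rows used in the fit, so the flag
# lines up with fitted() and resid() even when NA/NaN rows are dropped.
is_zero <- model.frame(m_sim)$perOOgivenP == 0

# Fitted vs. residuals, with zero and non-zero cases drawn differently.
plot(fitted(m_sim), resid(m_sim),
     col = ifelse(is_zero, "red", "blue"),
     pch = ifelse(is_zero, 1, 17),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
legend("topright",
       legend = c("perOOgivenP = 0", "perOOgivenP > 0"),
       col = c("red", "blue"), pch = c(1, 17))
```

With the real data, the same last two steps should apply with `m1` in place of `m_sim`. If the zero and non-zero points form visibly separate bands or trends, that would suggest the zeros are driving the pattern; if they are intermingled, the zeros are probably not the issue.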