Your questions are a little bit "big", so I'll start with some general comments and tips.
Some background reading and useful packages
You should probably take a look at some of the tutorial introductions to using mixed models. I would recommend starting with Baayen et al. (2008) -- Baayen is the author of languageR. Barr et al. (2013) discuss some issues with random effects structure, and Ben Bolker is one of the current developers of lme4. Baayen et al. is especially good for your questions because they spend a lot of time discussing the use of bootstrapping / permutation tests (the stuff behind mcp.fnc).
Literature
- Baayen, R. H., D. Davidson, and D. Bates (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390–412.
- Barr, Dale J., R. Levy, C. Scheepers, and H. J. Tily (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278.
- Bolker, Benjamin M., M. E. Brooks, C. J. Clark, S. W. Geange, J. R. Poulsen, M. H. H. Stevens, and J.-S. S. White (2009). Generalized linear mixed models: a practical guide for ecology and evolution. Trends in Ecology & Evolution, 24(3), 127–135.
Florian Jaeger also has a bunch of stuff on the practical side of mixed models on his lab's blog.
There are also a number of useful R packages for visualizing and testing mixed models, like lmerTest and effects. The effects package is especially nice because it lets you plot the linear regression fits and confidence intervals going on behind the scenes.
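As a minimal sketch (the model, data, and variable names -- rt, frequency, subject, item, dat -- are all hypothetical), plotting those fitted effects and confidence intervals might look like this:

```r
library(lme4)
library(effects)

# Hypothetical model: reaction time predicted by word frequency,
# with crossed random intercepts for subjects and items.
m <- lmer(rt ~ frequency + (1 | subject) + (1 | item), data = dat)

# One panel per fixed effect, showing the fitted regression line
# and its confidence band.
plot(allEffects(m))
```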
Goodness of fit and comparing models
$p$-values are a really lousy way to compare mixed models (and are generally not a great method for anything). There is a reason why Doug Bates (the original author of lme4) does not feel it is necessary to include them. If you really want to go that way (and I beg you not to), the aforementioned lmerTest provides some additional facilities for calculating and dealing with them.
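As a rough sketch (reusing the hypothetical model and data names from above): lmerTest masks lme4::lmer, and its summary() then reports denominator degrees of freedom and p-values (Satterthwaite's approximation by default).

```r
library(lmerTest)  # loads lme4 and masks lmer()

# Same hypothetical model as before; the fitted object now carries the
# information needed for approximate df and p-values.
m_p <- lmer(rt ~ frequency + (1 | subject) + (1 | item), data = dat)
summary(m_p)  # t-values now come with estimated df and p-values
```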
Better ways of comparing models are the log-likelihood or the various information-theoretic criteria like AIC and BIC. For AIC and BIC, the general rule is "smaller is better", but you have to be careful because they both penalize for more model degrees of freedom and there is no "absolute" good or bad value. To get a nice summary of AIC and log-likelihood for several models, you can use the anova() function, which has been overloaded to accept mer objects. Please note that the $\chi^2$ values given there are only valid for comparisons between nested models; nonetheless, the nicely tabular output is convenient even for other comparisons. For nested models, you get a proper $\chi^2$ test and don't need $p$-values to directly compare two models. The downside is that it's not immediately clear just how good your fit is in absolute terms.
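A minimal sketch of such a comparison, again with hypothetical data and variable names:

```r
library(lme4)

# Fit a null model and a richer nested model with ML (REML = FALSE),
# so the likelihoods are directly comparable.
m0 <- lmer(rt ~ 1         + (1 | subject) + (1 | item), data = dat, REML = FALSE)
m1 <- lmer(rt ~ frequency + (1 | subject) + (1 | item), data = dat, REML = FALSE)

anova(m0, m1)  # AIC, BIC, log-likelihood, and a chi-squared test (nested models only)
AIC(m0, m1)    # smaller is better, other things being equal
```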
For looking at the individual effects (i.e. the stuff you would see in a traditional ANOVA), you should look at the $t$-values for the fixed effects in the models (which are part of the summary(), if I'm not mistaken). Generally, anything with $|t| > 2$ is considered good (more details in Baayen et al.). You can also access the fixed effects directly with the helper function fixef().
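For instance, continuing the hypothetical model from above:

```r
summary(m1)        # fixed-effect estimates, standard errors, and t-values
fixef(m1)          # just the fixed-effect coefficients
coef(summary(m1))  # the fixed-effects table as a matrix, if you want the t column
```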
You should also make sure that none of your fixed effects are too strongly correlated -- multicollinearity violates model assumptions. Florian Jaeger has written a bit on this, as well as a few possible solutions.
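One quick, informal check (with hypothetical predictor names) is to look at the pairwise correlations of the predictors before fitting:

```r
# Build the fixed-effects design matrix and inspect pairwise correlations;
# values near +/-1 signal problematic collinearity.
X <- model.matrix(~ frequency + length + familiarity, data = dat)
cor(X[, -1])  # drop the intercept column
```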
Multiple comparisons
I'll address your question #4 a bit more directly. The answer is basically the same as good practice with traditional ANOVA; unfortunately, this seems to be a spot where there is a great deal of uncertainty for most researchers. (It's the same as traditional ANOVA because both linear mixed-effects models and ANOVA are based on the general linear model, it's just that mixed models have an extra term for the random effects.) If you're assuming that the time windows make a difference and want to compare the effects of time, you should include time in your model. This, incidentally, will also provide a convenient way for you to judge whether time made a difference: is there a main (fixed) effect for time? The downside to going this route is that your model will get a LOT more complex, and the single "super" model with time as a parameter will probably take longer to compute than three smaller models without time as a parameter. Indeed, the classic tutorial example for mixed models is sleepstudy, which uses time as a parameter.
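The sleepstudy model that ships with lme4 looks like this, with time (Days) entering as both a fixed effect and a by-subject random slope:

```r
library(lme4)

# Reaction time as a function of days of sleep deprivation,
# with a random intercept and slope for each subject.
fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
summary(fm)  # is there a main (fixed) effect of Days?
```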
I couldn't quite tell whether the different characteristics are meant to be predictors or dependent variables. If they're meant to be predictors, you could throw them all into one model and see which one has the largest $t$-value, but this model would be incredibly complex, even if you don't allow interactions, and take a long time to compute. The faster, and potentially easier, way would be to compute a different model for each predictor. I would recommend using a foreach loop to parallelize this across as many cores as you have -- lme4 doesn't yet do its matrix operations in parallel, so you can compute a different model on each core. (See the short introduction here.) Then, you can compare the AIC and BIC of each model to find which predictor is best. (They're not nested in this case, so you can't use the $\chi^2$ statistic.)
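A rough sketch of that approach (data, predictor, and variable names are all hypothetical):

```r
library(lme4)
library(foreach)
library(doParallel)

registerDoParallel(cores = parallel::detectCores())

predictors <- c("frequency", "length", "familiarity")

# Fit one model per predictor, each on its own core.
fits <- foreach(p = predictors, .packages = "lme4") %dopar% {
  f <- reformulate(c(p, "(1 | subject)", "(1 | item)"), response = "rt")
  lmer(f, data = dat, REML = FALSE)
}

names(fits) <- predictors
sapply(fits, AIC)  # smaller is better; the models are not nested, so no chi-squared test
```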
If your characteristics are the dependent variable, then you will have to compute different models anyway, and then you can use the AIC and BIC to compare the results.
I would probably go with your original model, fitted to your full dataset. I generally think of these things as facilitating sensitivity analyses. That is, they point you towards what to check to ensure that you don't have a given result only because of something stupid. In your case, you have some potentially influential points, but if you rerun the model without them, you get substantively the same answer (at least with respect to the aspects that you presumably care about). In other words, use whichever threshold you like -- you are only refitting the model as a check, not as the 'true' version. If you think that other people will be sufficiently concerned about the potential outliers, you could report both model fits. What you would say is along the lines of:
Here are my results. One might be concerned that this picture only emerges due to a couple of unusual, but highly influential, observations. These are the results of the same model, but without those observations. There are no substantive differences.
It is also possible to remove them and use the second model as your primary result. After all, staying with the original dataset amounts to an assumption about which data belong in the model just as much as going with the subset. But people are likely to be very skeptical of your reported results because psychologically it is too easy for someone to convince themselves, without any actual corrupt intent, to go with the set of post-hoc tweaks (such as dropping some observations) that gives them the result they most expected to see. By always going with the full dataset, you preempt that possibility and assure people (say, reviewers) that that isn't what's going on in your project.
Another issue here is that people end up 'chasing the bubble': when you drop some potential outliers and rerun your model, you end up with results that show new, different observations as potential outliers. How many iterations are you supposed to go through? The standard response is that you should stay with your original, full dataset and run a robust regression instead. This, again, can be understood as a sensitivity analysis.
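One way to operationalize such a sensitivity check (a sketch only, reusing the hypothetical model m1 and data dat from above, and assuming no missing data so the hat values line up with the rows of dat):

```r
library(lme4)

h    <- hatvalues(m1)    # leverage of each observation
flag <- h > 2 * mean(h)  # a common rule-of-thumb cutoff for "high leverage"

# Refit without the flagged observations and compare the fixed effects.
m1_sub <- update(m1, data = dat[!flag, ])
cbind(full = fixef(m1), subset = fixef(m1_sub))  # do the conclusions change?
```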
Best Answer
(This started out as a comment but seemed to be getting too long.)
This question may be getting less attention than it would otherwise deserve because it is very broad (among other things, you've asked 5 separate questions here). A few answers:
- Working conditionally on the estimated random effects (re.form=NULL, or the default, in lme4) would make more sense.
- Any influence measure you compute (e.g. via hatvalues.merMod(), see below) will be conditional on the estimated variance-covariance matrices of the random effects; this is different from the question of whether you're conditioning on conditional modes/BLUPs or not. If you don't want to condition on these estimates, you'll have to (1) assume multivariate Normality of the estimates of the variance-covariance parameters (ugh) or (2) do some kind of parametric bootstrapping (double-ugh).
- The influence.ME package does a lot of its work by a semi-brute-force method: refitting the model with each observation (or each group) left out in turn. Note also the difference between influential observations and influential groups, either of which might be of interest.
- The lme4 package does provide a hat matrix (or its diagonal) via ?hatvalues.merMod, so you could use these to compute some standard influence measures.
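A minimal sketch of both routes, assuming a fitted merMod object m with a grouping factor subject (both names hypothetical):

```r
library(lme4)
library(influence.ME)

# Group-level influence: influence() refits the model with each subject
# left out in turn and returns an "estex" object.
infl <- influence(m, group = "subject")
cooks.distance(infl)  # Cook's distance for each subject
dfbetas(infl)         # change in each fixed effect when a subject is dropped

# Observation-level leverage straight from lme4's hat matrix diagonal.
h <- hatvalues(m)
```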
References
- Abad, Ariel Alonso, Saskia Litière, and Geert Molenberghs. "Testing for Misspecification in Generalized Linear Mixed Models." Biostatistics 11, no. 4 (2010): 771–86. doi:10.1093/biostatistics/kxq019.
- Singer, Julio M., Juvencio S. Nobre, and Francisco M. M. Rocha. "Diagnostic and Treatment for Linear Mixed Models," 5486. Hong Kong, 2013. http://2013.isiproceedings.org/Files/CPS203-P28-S.pdf.