Regression – Understanding Linear Mixed Effects Model and Random Effects with Weights in R

mixed-model, regression

Could anybody please shed some light on this? I have $N$ individuals, divided into $K$ groups with $N>K$. Some groups have only 1 individual; other groups have more.

Each individual has 4 features (4 variables): $F_0, F_1, F_2,$ and $F_3$, where $F_0$ is simply the group indicator. Therefore the data matrix is of size $N \times 4$. There is also a weight vector $w$ of length $N$, which gives the weights for each of the $N$ individuals in the regression.

May I ask if the following model is a random-intercept model?

  1. There is a common beta for all N individuals.
  2. Each group has a different within group regression line (same slope but different intercepts).
  3. The regression line within each group crosses the "cloud" consisting of the group members. And the individual residuals scatter around the regression line, within each group.

This sounds like a "random-intercept" model to me. However, how do I explicitly write out the equation?

$$ {\boldsymbol y} = {\bf X} \beta + {\bf Z} b + \varepsilon $$

More specifically, with the three feature variables $F_1, F_2$ and $F_3$ and the group indicator variable $F_0$. I am having difficulty writing out $X$ and $Z$ explicitly. Moreover, my $F_2$ is a factor variable.

Could anybody please show me what the ${\bf X}$ and ${\bf Z}$ matrices look like explicitly? And what do the "$b$"'s represent here?

And how do I set up the weights in LME in R? If I would like to have "group"-weights and "individual"-weights, how shall I do it?

Thank you!

[Edit 1, @Macro]
Thanks a lot, Macro, for your very comprehensive answer. I am still digesting it and will come back with more questions. My first question: you say that conditions 2 and 3 conflict with each other.

I don't understand why. Here is my thinking: even if the overall beta fits a certain group poorly, by adjusting the intercept for that group I can at least still get the regression line to pass through the cloud, am I right?

More specifically, in a random intercept model (all groups sharing the same beta), can the following situation arise?

All data points in a group reside on one side of the regression line of that group?


In a random intercept model, for each group, which point does the regression line pass through? In ordinary OLS, we know the regression line passes through the point $(\bar{x}, \bar{y})$.

But for each group in a random intercept model, which point is that "central" point?


Moreover, in R, what's a convenient way to visualize what's happening within a certain group in a mixed model?

Thanks a lot for your help!

Best Answer

I'll answer each of your questions one at a time.

May I ask if the following model is a random-intercept model?

1. There is a common beta for all N individuals.

2. Each group has a different within group regression line (same slope but different intercepts).

3. The regression line within each group crosses the "cloud" consisting of the group members. And the individual residuals scatter around the regression line, within each group.

Conditions (2) and (3) are in conflict with each other. If each group has the same slope but different intercepts then the within group regression line will not pass through the cloud of group observations unless the truth is that every group has the exact same slope. You would need both a random intercept and a random slope in every predictor to guarantee that condition (3) is satisfied.
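For concreteness, here is how the two specifications might be written in lme4 syntax. This is only a sketch: the data frame `dat` and the columns `y`, `F0`-`F3` below are simulated stand-ins for the setup described in the question.

```r
library(lme4)

# Hypothetical data in the question's layout: group indicator F0,
# predictors F1-F3 (F2 a factor), response y
set.seed(1)
dat <- data.frame(
  F0 = factor(rep(1:6, each = 10)),
  F1 = rnorm(60),
  F2 = factor(sample(c("a", "b"), 60, replace = TRUE)),
  F3 = rnorm(60)
)
dat$y <- 1 + dat$F1 + (dat$F2 == "b") + dat$F3 +
  rep(rnorm(6), each = 10) + rnorm(60, sd = 0.3)

# Condition (2): common slopes, group-specific intercepts
m_int <- lmer(y ~ F1 + F2 + F3 + (1 | F0), data = dat)

# Guaranteeing condition (3) in general also needs random slopes in
# every predictor, so each group gets its own full regression line:
m_full <- lmer(y ~ F1 + F2 + F3 + (1 + F1 + F2 + F3 | F0), data = dat)
```

With only six small groups, `m_full` may produce singular-fit warnings; the point is the formula syntax, not the particular fit.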

However, how do I explicitly write out the equation?

The familiar formula for the random effects model, as you pointed out, is

$$ {\bf Y}_i = {\bf X}_i {\boldsymbol \beta} + {\bf Z}_{i} {\bf b}_i + {\boldsymbol \varepsilon}_{i} $$

Where $${\bf Y}_i = \left( \begin{array}{c} y_{i1} \\ y_{i2} \\ \vdots \\ y_{i n_{i}} \end{array} \right) $$ is the vector of responses in group $i$ and $n_{i}$ is the number of observations in group $i$ ($n_i$ can be $1$ for some groups, but not all).

$${\bf X}_i = \left( \begin{array}{ccccc} 1 & x_{i11} & x_{i12} & \cdots & x_{i1p} \\ 1 & x_{i21} & x_{i22} & \cdots & x_{i2p} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{i n_{i} 1} & x_{i n_{i} 2} & \cdots & x_{i n_{i} p} \\ \end{array} \right) $$

is the matrix of the $p$ predictor variables for each observation in group $i$, with corresponding fixed effects regression coefficient vector ${\boldsymbol \beta}$ of length $p+1$ (including the intercept). $${\bf b}_i = \left( \begin{array}{c} b_{i0} \\ b_{i1} \\ \vdots \\ b_{i,m-1} \end{array} \right) $$ is the $m$-length vector of random effects, and $${\bf Z}_i = \left( \begin{array}{cccc} z_{i11} & z_{i12} & \cdots & z_{i1m} \\ z_{i21} & z_{i22} & \cdots & z_{i2m} \\ \vdots & \vdots & \vdots & \vdots \\ z_{in_{i} 1} & z_{i n_{i} 2} & \cdots & z_{i n_{i} m} \\ \end{array} \right) $$ is the random effects design matrix for group $i$, and

$$ {\boldsymbol \varepsilon}_i = \left( \begin{array}{c} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{i n_i} \end{array} \right) $$ is the vector of errors. So, for example if you just had a random intercept and a random slope in the first predictor, then

$${\bf b}_i = \left( \begin{array}{c} b_{i0} \\ b_{i1} \end{array} \right), \quad {\bf Z}_i = \left( \begin{array}{cc} 1 & x_{i11} \\ 1 & x_{i21} \\ \vdots & \vdots \\ 1 & x_{i n_{i} 1} \\ \end{array} \right) $$

where $b_{i0}$ is the random intercept and $b_{i1}$ is the random slope. If you only had a random intercept and nothing else then ${\bf b}_i$ would be a scalar and ${\bf Z}_{i}$ would just be a vector of $1$s.

In your particular example, you have a categorical predictor (say, with $k$ levels), which, for modeling, is effectively like having $k-1$ dummy variables that are $1$ if the variable takes on that value and $0$ otherwise. So, your ${\bf X}_{i}$ matrix will have $k+2$ columns - one column of $1$s, two columns with the values of the quantitative predictors, and $k-1$ columns that are $0/1$ indicators of which level the categorical predictor takes. If you are going to include random slopes in every predictor, then ${\bf Z}_{i}$ will be exactly the same as ${\bf X}_{i}$. As mentioned above, if you only want a random intercept in the model then ${\bf Z}_{i}$ is just a column of $1$s.
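This column layout is exactly what R's `model.matrix` produces. A small illustration with made-up data (all values hypothetical), where `F2` is a factor with $k = 3$ levels:

```r
# Hypothetical data in the question's layout
dat <- data.frame(
  F0 = factor(c(1, 2, 2, 2, 3, 3)),              # group indicator
  F1 = c(0.5, 1.2, -0.3, 0.8, 1.1, -0.6),        # quantitative predictor
  F2 = factor(c("a", "b", "c", "a", "b", "c")),  # categorical, k = 3 levels
  F3 = c(2.0, 1.5, 0.7, 0.1, -1.2, 0.4)          # quantitative predictor
)

# Fixed effects design matrix: intercept + F1 + (k - 1) dummies + F3,
# i.e. k + 2 = 5 columns
X <- model.matrix(~ F1 + F2 + F3, data = dat)
colnames(X)  # "(Intercept)" "F1" "F2b" "F2c" "F3"

# For a random-intercept-only model, Z_i for group i is a column of 1s;
# here group 2 has n_i = 3 observations:
Z_2 <- matrix(1, nrow = sum(dat$F0 == "2"), ncol = 1)
```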

how do I set up the weights in LME in R?

This depends on what you mean by "weights". Usually this means that certain observations are weighted (e.g. inverse-probability weights to correct for unequal-probability sampling) and the criterion function being optimized to produce your estimates (probably the likelihood function in this case) is weighted accordingly. For example, if the groups were sampled with unequal probability, the function

$$ {\bf L} = \sum_{i=1}^{K} w_i L_i $$

where $L_i$ is the group $i$ log-likelihood, may be optimized instead of the unweighted sum. In terms of point estimation, this is effectively the same as replicating group $i$ in the data set a number of times equal to $w_i$, which can be accomplished by doing exactly that - expand the data set based on the weights and fit the model to the expanded data set. I'm not sure if there is functionality in lme to do this automatically, so you may need to do it yourself.
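The "expand the data set" trick is a one-liner with `rep`. A sketch, assuming integer group-level weights (the names `dat` and `w` below are hypothetical):

```r
# Tiny hypothetical data set: group indicator F0 and response y
dat <- data.frame(F0 = factor(c(1, 2, 2, 3)), y = c(1.2, 0.7, 0.9, 1.5))

# Hypothetical integer group weights, named by group level
w <- c("1" = 1, "2" = 3, "3" = 2)

# Replicate each row according to its group's weight
idx <- rep(seq_len(nrow(dat)), times = w[as.character(dat$F0)])
dat_expanded <- dat[idx, ]
nrow(dat_expanded)  # 1 + 3 + 3 + 2 = 9 rows

# then fit on the expanded data, e.g.
# library(lme4); lmer(y ~ ... + (1 | F0), data = dat_expanded)
```

Non-integer weights would need rounding or a rescaling to integers first; this sketch only handles the integer case.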

Regarding weighting within a group (i.e. at the individual level), I do not recommend this in the context of random effects modeling. To see why, consider the fact that by weighting within a cluster, you're effectively creating exact copies of individuals within the cluster. Therefore, there will be pockets within the group that are perfectly correlated with each other, so the estimates of the random effect variances will be biased up, since the model will think that members of a group are more correlated than they are.

Comments related to the Edit:

  1. The only way adding a random intercept would make each group's regression line pass through its "cloud" is if each group's cloud were just a vertical shift of the others' - that is, the slopes are exactly the same but the intercepts differ. More generally, the linear least squares line requires both a slope and an intercept. If you don't let the slopes vary, the random intercept will go wherever is "best" (i.e. maximizes the posterior mode, if you're trying to estimate the random effects), so I don't think it could end up all the way on one side or the other of the "cloud".

  2. The central point within each group would presumably be the within-group sample means $(\bar{x}_i, \bar{y}_i)$, although this would require some more thought - as would the comments made in (1) - since we aren't fitting this model by least squares (although it is closely related to least squares, since it involves the Gaussian likelihood).

  3. The only way I can see to visualize this is to plot the points for a given group, along with the fitted line within that group, the same way you would with ordinary regression. You can extract posterior estimates of the random effects using the ranef function (applied to the lmer model fit) and the fixed effects using the fixef function.
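Putting that together, a sketch of such a within-group plot (the data here are simulated, and the names `dat`, `fit`, and `grp` are mine; note that `coef` in lme4 conveniently combines `fixef` and `ranef` into per-group coefficients):

```r
library(lme4)

# Simulate a small random-intercept data set (hypothetical names/values)
set.seed(42)
dat <- data.frame(F0 = factor(rep(1:5, each = 8)), F1 = rnorm(40))
dat$y <- 1 + 2 * dat$F1 + rep(rnorm(5, sd = 0.5), each = 8) +
  rnorm(40, sd = 0.3)

fit <- lmer(y ~ F1 + (1 | F0), data = dat)

# coef() combines fixef() and ranef(): one (intercept, slope) row per group
cf <- coef(fit)$F0

# Plot one group's points with its fitted within-group line
grp <- "3"
with(subset(dat, F0 == grp), plot(F1, y, main = paste("Group", grp)))
abline(a = cf[grp, "(Intercept)"], b = cf[grp, "F1"])
```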
