Generalized Linear Mixed Model – How Means Differ from Manual Calculation?

expected value, generalized linear model, mixed model

I've recently read a paper which used generalised linear mixed models to estimate mean annual and monthly values for the response variable in the model. The response variable was a normally distributed continuous variable, and year and month were used as predictors in the model. The annual and monthly means were then calculated for the response variable.

  1. How are the annual and monthly means estimated for the response variable in a generalised linear mixed model?
  2. How would the mean calculated by a generalised linear model differ from a mean calculated by hand?

Best Answer

Here are the short answers to your questions:

  1. The regression model allows you to use the structure of the model to estimate the mean at particular predictor values by plugging those values into the fitted model equation. You can even do this for predictor values you didn't directly observe in the data set (more about this below).

  2. The difference is that you're calculating the expected value (mean) conditional on the predictor values and random effects, which can be quite different from the mean unconditional on the predictor values (which I assume is what you mean by the means calculated by hand), if the predictors truly are important in predicting the response.

Here are the long answers to your questions:

Specifically, in the generalized linear mixed model you must be dealing with data that has some clustering in it, so let $Y_{ij}$ be the outcome variable for individual $i$ in cluster $j$ with corresponding predictor values ${\bf X}_{ij}$. The basic generalized linear mixed model with a random intercept is

$$ \varphi \left( E(Y_{ij}|{\bf X}_{ij}, \eta_j) \right) = \alpha + {\bf X}_{ij}{\boldsymbol \beta} + \eta_j $$

where $\eta_j \sim N(0,\sigma^{2}_{\eta})$ is the cluster-$j$ random effect, $\varepsilon_{ij} \sim N(0,\sigma^{2}_{\varepsilon})$ is the residual error term, and $\varphi(\cdot)$ is the link function. The link function is used to express the mean of the outcome variable as something that is linear in the predictors/random effects, and is often used to map a bounded value (e.g. one in $(0,1)$) to an unbounded range. The link function depends on the type of GLMM. Some examples:

  • $\varphi(x) = x$ is the identity link, in which case you have a linear model (this appears to be the case in the example you've mentioned)

  • $\varphi(x) = \log \left( \frac{x}{1-x} \right)$ is the logit (logistic) link, used for logistic GLMMs with binary outcomes.

  • $\varphi(x) = \log(x)$ is the log link, used for Poisson GLMMs.

  • $\varphi(x) = \Phi^{-1}(x)$ is the probit link, used for probit GLMMs.
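
To make these links concrete, here is a minimal numerical sketch in Python (the linear predictor value is just an arbitrary illustrative number, not from any real model) showing how each inverse link maps a linear predictor back to the mean scale:

```python
import numpy as np
from scipy.stats import norm

# An illustrative value of the linear predictor: alpha + X*beta + eta
lp = 0.75

# Identity link: the mean is the linear predictor itself (linear mixed model)
identity_mean = lp

# Logit link: inverse is the logistic function, so the mean lies in (0, 1)
logit_mean = 1.0 / (1.0 + np.exp(-lp))

# Log link: inverse is exp, so the mean is positive (Poisson GLMM)
log_mean = np.exp(lp)

# Probit link: inverse is the standard normal CDF, so the mean lies in (0, 1)
probit_mean = norm.cdf(lp)

print(identity_mean, logit_mean, log_mean, probit_mean)
```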

To answer question (1): if we have estimates of $\alpha$ and $\boldsymbol \beta$ (and predicted values $\hat{\eta}_j$ of the random effects), then we can estimate the conditional mean of $Y_{ij}$, given the random effects and the predictor values:

$$ \widehat{E}(Y_{ij} | {\boldsymbol X}_{ij},\eta_j) = \varphi^{-1}(\hat \alpha + {\bf X}_{ij}\hat {\boldsymbol \beta} + \hat{\eta}_j)$$

In your example, the time variables (year and month) are the covariates (i.e. the ${\bf X}_{ij}$s) in the model, and they are plugged into the regression equation to estimate the annual and monthly means.
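
Here is a hypothetical sketch in Python of how such monthly means could be obtained. It simulates clustered data (the column names `value`, `month`, `year`, `site` are invented, not from the paper), fits a linear mixed model with statsmodels' `MixedLM`, and plugs each month into the fitted fixed-effect equation. It is only an illustration of the idea, not the paper's actual analysis:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Toy data: a continuous response measured at 10 sites (the clusters)
# over 12 months in 3 years -- all column names here are hypothetical.
df = pd.DataFrame({
    "month": np.tile(np.arange(1, 13), 30),
    "year":  np.tile(np.repeat([2018, 2019, 2020], 12), 10),
    "site":  np.repeat(np.arange(10), 36),
})
site_effect = rng.normal(0, 2, size=10)               # random intercepts eta_j
df["value"] = (5 + 0.8 * np.sin(2 * np.pi * df["month"] / 12)
               + 0.3 * (df["year"] - 2018)
               + site_effect[df["site"]]
               + rng.normal(0, 1, size=len(df)))

# Linear mixed model (identity link): fixed effects for month and year,
# random intercept for site.
result = smf.mixedlm("value ~ C(month) + C(year)", df, groups=df["site"]).fit()

# Estimated monthly means: plug each month (paired here with one reference
# year) into the fitted fixed-effect equation.
grid = pd.DataFrame({"month": np.arange(1, 13), "year": 2018})
print(grid.assign(fitted_mean=result.predict(grid)))
```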

One useful thing this allows you to do is to estimate the mean for a predictor value you didn't directly observe. That is,

  • the (linear) structure imposed by the model allows you to estimate $E(Y|X)$ for values of $X$ you didn't observe, by linear interpolation. When the linear model is a good fit this is a very nice bonus, since you wouldn't be able to calculate this quantity by hand without the model (see the sketch after this list).

  • Note: One should exercise extreme caution when plugging in predictor values that go outside of the range of the data used to fit the model. This is called extrapolation and can perform arbitrarily poorly if the model does not apply outside of the range of the observed data.
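
Here is a small sketch of the interpolation/extrapolation point, using an ordinary straight-line fit; the data and predictor values are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fit a straight line y = a + b*x to noisy data observed only at integer x,
# then use the fitted line at x values we never observed.
x = np.arange(1, 11, dtype=float)                  # observed predictor values
y = 2.0 + 0.5 * x + rng.normal(0, 0.3, size=x.size)

b, a = np.polyfit(x, y, deg=1)                     # slope, intercept

# Interpolation: x = 4.5 was never observed but lies inside [1, 10],
# so the fitted line gives a sensible estimate of E(Y | X = 4.5).
print("interpolated mean at x = 4.5:", a + b * 4.5)

# Extrapolation: x = 50 is far outside the observed range; the formula still
# returns a number, but nothing in the data supports it.
print("extrapolated mean at x = 50 :", a + b * 50)
```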

The equation above can also be used to estimate relative changes in an individual's average outcome for different predictor values. Often this relative change doesn't even depend on the unobserved random effect. For example, with the log link:

$$\frac{ \widehat{E}(Y_{ij} | {\boldsymbol X}_{ij},\eta_j) } { \widehat{E}(Y_{ij} | {\boldsymbol X}'_{ij},\eta_j) } = \frac{ \exp (\hat \alpha + {\bf X}_{ij}\hat {\boldsymbol \beta} + \eta_j)}{ \exp (\hat \alpha + {\bf X}'_{ij}\hat {\boldsymbol \beta} + \eta_j)} = \exp \left( ({\bf X}_{ij}-{\bf X}'_{ij})\hat {\boldsymbol \beta} \right) $$
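
A quick numerical check of this cancellation, with made-up values standing in for $\hat \alpha$, $\hat {\boldsymbol \beta}$, and the random effect:

```python
import numpy as np

# With a log link, the ratio of conditional means for two predictor values
# does not depend on the cluster random effect eta_j.
alpha, beta = 1.2, 0.4          # illustrative fixed-effect estimates
x_new, x_old = 3.0, 1.0         # two predictor values being compared

for eta in (-1.5, 0.0, 2.0):    # three different cluster random effects
    ratio = np.exp(alpha + beta * x_new + eta) / np.exp(alpha + beta * x_old + eta)
    print(eta, ratio)           # the same ratio every time

print("exp((x_new - x_old) * beta) =", np.exp((x_new - x_old) * beta))
```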

Note: The mean unconditional on the random effect is the average over the random effects:

$$ E(Y_{ij} | {\boldsymbol X}_{ij}) = E_{\eta} \left( E(Y_{ij} | {\boldsymbol X}_{ij},\eta_j) \right) $$

which is not trivially related to the conditional expectation (since it is an integral of the conditional expectation) unless the identity link is used (i.e. you have a linear model). Some remarks about the mean unconditional on the random effects:

  • In a linear mixed effects model, the mean unconditional on the random effects is the same as the conditional mean with the random effect set to its mean of 0 (i.e. the fixed-effects part) - this is fairly clear when you plug in the identity link and use the fact that the random effects have mean 0. This does not generally hold for non-linear links (see here for more discussion, and the numerical sketch after this list).

  • The change in the mean conditional on the predictors (but unconditional on the random effects) is related to the average change in the population for a change in the predictor value.

  • The change in the conditional mean when the random effects are also conditioned on is related to the change in the expected value for a particular individual (or cluster) for a change in the predictor value. More on this can be read in the link above.
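
Here is the numerical sketch referred to above, contrasting the identity and logit links; the linear predictor value and the random-effect standard deviation are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

lp = 1.0                                  # alpha + X*beta, an illustrative value
eta = rng.normal(0, 1.5, size=200_000)    # simulated random effects

# Identity link: averaging over eta recovers the conditional mean at eta = 0.
print("identity:", np.mean(lp + eta), "vs", lp)

# Logit link: averaging the inverse link over eta does NOT equal the inverse
# link evaluated at the mean linear predictor (Jensen's inequality).
invlogit = lambda z: 1.0 / (1.0 + np.exp(-z))
print("logit   :", np.mean(invlogit(lp + eta)), "vs", invlogit(lp))
```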

To answer question (2): When you calculate the sample mean you're effectively marginalizing over the predictor values and throwing out the information they provide. That is, assuming your ${\bf X}$s are a random sample (that is, your data are not from a retrospective study or something else like that), then the sample mean estimates

$$ E(Y_{ij}) = E_{\bf X} \left( E(Y_{ij} | {\bf X}_{ij} ) \right) $$

If ${\bf X}_{ij}$ truly does have an effect, then $E(Y_{ij} | {\bf X}_{ij} )$ can be very different from $E(Y_{ij})$ and that's the explanation for what you're seeing.

Note that if you have categorical predictors, then you can estimate the conditional means by stratifying the data set and calculating the mean within each stratum, but this is exactly what the regression model does (see the sketch below). So, in any case, you're no worse off using the regression estimates of the mean, as long as the model fits reasonably well.
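
To illustrate the last two points, here is a hypothetical sketch (with invented column names `value` and `month`) showing that stratified means and regression-based conditional means coincide for a categorical predictor, and that the overall sample mean marginalizes over the predictor and can differ from any particular conditional mean:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Toy data with a single categorical predictor (hypothetical column names).
df = pd.DataFrame({"month": np.repeat(np.arange(1, 13), 20)})
df["value"] = 10 + 0.5 * df["month"] + rng.normal(0, 1, size=len(df))

# Conditional means "by hand": stratify on month and average within strata.
by_hand = df.groupby("month")["value"].mean()

# Conditional means from a regression with month as a categorical predictor:
# the fitted values reproduce the stratified means exactly.
fit = smf.ols("value ~ C(month)", df).fit()
grid = pd.DataFrame({"month": np.arange(1, 13)})
from_model = pd.Series(fit.predict(grid).to_numpy(), index=grid["month"])

print(pd.DataFrame({"by_hand": by_hand, "from_model": from_model}))

# The overall sample mean marginalizes over month: it is the average of the
# conditional means and can differ a lot from any one of them.
print("overall sample mean:", df["value"].mean())
```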