The issue is complicated because your model is logistic. Under normal circumstances, such as in a linear regression, most of what you say would apply. Focusing on the linear model, I say "most" because adding variables should not increase the random-intercept variance even if the variables are mediocre predictors; the random-intercept variance can go up very slightly, but not by much. With logistic regression, however, that is not necessarily the case.
I'll make some claims then explain why at the end.
If you add variables that explain the outcome better to a multilevel logistic regression, the variance of the random intercept will increase. However, if those variables also account for differences between the embarks, then the random-intercept variance may decrease.
If a variable in no way accounts for any differences between embarks but explains the outcome better, the variance of the random intercept will definitely go up. An example is a variable that you have centered on the mean of each embark, so that its group means do not vary across embarks.
This is because the logistic error variance is fixed at $\pi^2/3$, so any improvement in how well the model explains the outcome is reflected in an increased random-intercept variance, unless the improvement simultaneously explains differences between embarks and thereby reduces the random-intercept variance. I hope this makes some sense.
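To spell out the reasoning, writing $y^{*}$ for the latent response of a random-intercept logistic model, the total latent variance decomposes as
$$\operatorname{var}(y^{*}) = \sigma^{2}_{f} + \sigma^{2}_{u_{0}} + \pi^{2}/3,$$
where $\sigma^{2}_{f}$ is the variance attributable to the fixed effects and $\sigma^{2}_{u_{0}}$ is the random-intercept variance, and the intraclass correlation is $\sigma^{2}_{u_{0}} / (\sigma^{2}_{u_{0}} + \pi^{2}/3)$. Because the level-1 term is fixed at $\pi^{2}/3 \approx 3.29$ and cannot shrink, explaining within-embark variation rescales the whole latent variable instead, which pushes $\sigma^{2}_{u_{0}}$ up unless the new predictor also explains between-embark differences.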
Replying to your comments about the ICC: since you are using R, check out the MuMIn package, which has an r.squaredGLMM() function. This should allow you to calculate $R^2$ as defined by Nakagawa and Schielzeth (http://dx.doi.org/10.1111/j.2041-210x.2012.00261.x). Theirs is a relatively simple approach that takes the different sources of variance into consideration (fixed effects, random effects, logistic error) so that one can compare across models with varying fixed effects.
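As a minimal sketch of the call (using lme4's built-in cbpp data in place of your own, since the function only needs a fitted glmer object):

```r
library(lme4)
library(MuMIn)

## Random-intercept binomial GLMM on lme4's built-in cbpp data
fit <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

## Marginal (fixed effects only) and conditional (fixed + random) R^2
## in the Nakagawa & Schielzeth sense; rows are different approximations
r.squaredGLMM(fit)
```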
Because of this structure, I don’t think I can just do a usual Cox Proportional Hazard model (like coxph()).
That's not necessarily true, and if it is true it might not be for the reason you think. The fact that, for technical reasons, many individuals have the same values for areas and concentrations doesn't by itself rule out a simple coxph() analysis. The problem is that survival among individuals in one nest or sample might differ systematically from survival in another nest or sample under otherwise identical conditions. You need to take such correlations among associated individuals into account.
If you only have "a few" different nests, you might be able to treat nests as fixed effects. Then you could use the standard coxph() function, with a cluster variable handling the correlations within samples. For anything more elaborate, coxme() is a reasonable choice.
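As a rough sketch of the two approaches (on simulated data, since I don't have yours; nest, sample, conc, time and status are placeholder names):

```r
library(survival)
library(coxme)

set.seed(1)
d <- expand.grid(ind = 1:10, sample = 1:4, nest = 1:6)
d$conc <- rnorm(nrow(d))                        # one illustrative compound
nest_eff <- rnorm(6, sd = 0.5)[d$nest]          # between-nest heterogeneity
d$time <- rexp(nrow(d), rate = exp(0.5 * d$conc + nest_eff))
d$status <- 1                                   # every individual has an event here
d$nest <- factor(d$nest)
d$sample <- factor(d$sample)
d$sample_id <- interaction(d$nest, d$sample)    # samples nested within nests

## Few nests: nests as fixed effects, robust SEs clustered on sample
fit_fixed <- coxph(Surv(time, status) ~ conc + nest + cluster(sample_id),
                   data = d)

## Mixed-effects Cox model: random intercepts (frailties) for nests and samples
fit_mixed <- coxme(Surv(time, status) ~ conc + (1 | nest/sample), data = d)
```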
I’ve read that, as a general rule of thumb, random effects + many predictor variables = less accurate coefficient estimates.
This depends on the details. If there are more than a few groups, modeling with random effects typically poses less of a problem in this respect than does modeling with fixed effects. With random effects you are mainly modeling the variance among the effects rather than the individual effects themselves, reducing the number of parameter values you have to estimate from the data.
The number of events is typically the major limiting factor for reliably fitting a survival model. The usual rule of thumb is to have about 15 events per coefficient that you are estimating in the model to avoid overfitting.
You have surface area and 14 compounds* that you would like to evaluate as fixed effects, along with some variances among random effects. So if you have something like 300 or more events, you probably would be OK with random effects for samples and nests, and fixed effects for all your compounds and surface area. Yes, the variance of coefficient estimates does increase as you try to estimate more coefficients and thus reduce the residual degrees of freedom, but if you have a large enough event/coefficient ratio to avoid overfitting that shouldn't be a big problem.
I figured, if a model containing random effects + chemical data is better than a model of just random effects, then the random effects are not critical to consider. Therefore, we can thus examine a model of just the chemical data ...
As Robert Long's comment implies, that would not be "following the rules." One way or another, you need to take into account the lack of independence among observations that you have with this type of structure. One can discuss whether that is best done with nests treated as fixed versus random effects, or whether random effects in a mixed model versus clusters in a standard Cox model are the best ways to handle samples. But you have to take the correlations among observations into account somehow.
*A couple of cautions about the chemical-concentration analysis. First, it's not clear when the concentrations are measured. For survival analysis those concentrations should be those present at the event times. If, for example, concentrations are measured at the end of the experiment, then it might be that the deaths are leading to the concentration changes rather than vice-versa. Second, if you have fewer nests than the 14 compounds you want to evaluate, then including all the compounds along with the nests as predictors will lead to a linear dependence that might pose problems with random-effect modeling of nests and almost certainly will be an issue if nests are treated as fixed effects. If you've been having problems with coxme() when trying to include all 14 compounds, that (not some inherent problem with the number of fixed-effect predictors in a mixed-effects model) might be why. You might consider a ridge-regression type of penalization on the concentrations to get around that.
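One way to do that is survival::ridge(); here is a hedged sketch on simulated data with placeholder names (glmnet with family = "cox" would be another route):

```r
library(survival)

set.seed(2)
n <- 200
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("conc", 1:5)))
nest <- factor(sample(1:4, n, replace = TRUE))
time <- rexp(n, rate = exp(0.3 * X[, 1]))
status <- rep(1, n)
d2 <- data.frame(time, status, nest, X)

## theta sets the strength of the ridge penalty on the concentration terms;
## the nest fixed effects are left unpenalized
fit_ridge <- coxph(Surv(time, status) ~ nest +
                     ridge(conc1, conc2, conc3, conc4, conc5, theta = 1),
                   data = d2)
summary(fit_ridge)
```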
Best Answer
To summarize the useful information provided in comments: you don't necessarily have to use a mixed model, but you do have to take the lack of independence among the measurements into account in some way. Robust standard errors and generalized estimating equations (GEE), as discussed by McNeish et al. (linked from a comment by @mkt), and generalized least squares are alternatives in many situations. Your situation, with trees nested within transects nested within shelterbelts, would seem best represented by a multi-level model.
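For concreteness, that nesting maps onto an lme4-style formula like (1 | shelterbelt/transect). A small simulated sketch, with placeholder names rather than your variables:

```r
library(lme4)

set.seed(3)
## Simulated stand-in data: 5 shelterbelts x 4 transects x 6 trees;
## "growth" and "treatment" are made-up names for illustration only
dat <- expand.grid(tree = 1:6, transect = 1:4, shelterbelt = 1:5)
dat$treatment <- rnorm(nrow(dat))
dat$growth <- 2 + 0.5 * dat$treatment +
  rnorm(5, sd = 1)[dat$shelterbelt] +
  rnorm(20, sd = 0.5)[as.integer(interaction(dat$shelterbelt, dat$transect))] +
  rnorm(nrow(dat))
dat$shelterbelt <- factor(dat$shelterbelt)
dat$transect   <- factor(dat$transect)

## Random intercepts for shelterbelts and for transects within shelterbelts
fit <- lmer(growth ~ treatment + (1 | shelterbelt/transect), data = dat)
summary(fit)
```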
The inability to distinguish any particular random-effect estimate from 0 in a mixed model isn't important. More important is the variance among those random-effect values, which are modeled as coming from a Gaussian distribution. For example, consider an adaptation of a simple example from the lmer help page (a sketch follows below). In the summary output, the random effects are reported first, with a standard deviation for the random intercept; in this example, that standard deviation is about the same magnitude as the residual standard deviation. The variance (square of the standard deviation) among those random-effect intercept values is what the model primarily estimates. That's what you should primarily pay attention to, in terms of "random effects" being "different from zero" (Question 1). Yes, you can extract the individual random-intercept estimates from the model, but the fact that you can't distinguish any individual values from 0 doesn't mean that the random effect as a whole is unimportant (Question 2).
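The original code and output aren't reproduced here, but a sketch of that kind of adaptation, using lme4's built-in sleepstudy data with a random intercept per Subject, would be:

```r
library(lme4)

## Random-intercept adaptation of the lmer help-page example (sleepstudy data)
fm <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
summary(fm)  # "Random effects" block: intercept SD and residual SD come first
ranef(fm)    # individual random-intercept estimates (BLUPs) for each Subject
```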
It's possible to have such low dispersion among the individual random-effect values that your model can't reliably estimate their variance and you end up with a singular model fit. That does not, however, appear to be the case with your data. As described on this page, linked from a comment by @dipetkov, that's typically seen when you try to fit more combinations of random effects than your data allow. How to proceed in such a case requires evaluation of what in particular is leading to the problem. The lack of independence among correlated observations still needs to be taken into account. Otherwise you do suffer from "pseudoreplication" (Question 3).
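If you want to check for that situation directly, lme4 flags it; assuming a fitted model such as fm from the sketch above:

```r
isSingular(fm)  # TRUE would indicate a variance component estimated at (or near) zero
VarCorr(fm)     # shows which variance/correlation estimate has collapsed
```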