Your original model:
$Y_{si} = \beta_0 + S_{0s} + (\beta_{1} + S_{1s})X_{1si} + \beta_{2}X_{2si} + \beta_{3}X_{3si} + \beta_{4}X_{1si}X_{2si} + \beta_{5}X_{2si}X_{3si} + \epsilon_{si}$, where $s = 1,\dots,S$ indexes the subject, $i = 1,\dots,I_s$ indexes the measurement, $X_{1si}$ is day of year, $X_{2si}$ is the factor, $X_{3si}$ is temperature, and $\epsilon_{si} \sim N(0, \sigma^2)$,
and $(S_{0s}, S_{1s})' \sim N\left((0,0)', \begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{12} & \sigma_2^2\end{pmatrix}\right)$. $\beta_0,\dots,\beta_5$ are fixed effects.
Is $X_{1si}$ equal to 1 for Jan 1 and 365 (or 366) for Dec 31 of each year? If so, you may need a periodic function of it, or may need to drop it, because the model implies that the difference between the means of $Y_{si}$ on Jan 1, 2016 and Dec 31, 2015 is $365\beta_1$, which is probably not realistic.
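As a quick illustration of what a periodic representation can look like, day of year can be encoded with sine/cosine terms so that Dec 31 and Jan 1 get nearly identical values instead of being 364 units apart (the period of 365.25 is an illustrative assumption, not something from the original model):

```python
import math

# Encode day of year periodically so day 1 and day 365 land close
# together on the circle; period 365.25 is an illustrative choice.
def seasonal_terms(day):
    angle = 2.0 * math.pi * day / 365.25
    return math.sin(angle), math.cos(angle)

jan1 = seasonal_terms(1)
dec31 = seasonal_terms(365)
# jan1 and dec31 are almost equal, unlike the raw values 1 and 365
```

Replacing the linear day-of-year term with such a pair of terms removes the artificial $365\beta_1$ jump at the year boundary.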
I think your random slope should be on $X_{3si}$ instead of on $X_{1si}$.
Maybe you can fit a model like this:
$Y_{si} = \beta_0 + S_{0s} + \beta_{1}X_{1si} + \beta_{2}X_{2si} + (\beta_{3} + S_{3s})X_{3si} + \beta_{4}X_{1si}X_{2si} + \beta_{5}X_{2si}X_{3si} + \epsilon_{si}$
Obviously, this is an exploratory analysis: you need to find the model that fits the data. My suggestion from experience is to first fit several fixed-effects models (linear models), with temperature alone and then with the other covariates, even the interactions. If you cannot find any model that behaves as you expect, maybe your theory is incorrect. If you find what you want, try adding the random effects to that model, so that the final model is more reasonable.
In matrix form, the mixed model is
$Y = X\beta + Z\gamma + \epsilon$, where $\gamma \sim N(0, G)$ and $\epsilon \sim N(0, R)$.
For a given $X$, the variance-covariance of $Y$ is
$Var(Y) = ZGZ'+R$
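A toy numeric check of this identity, with made-up values for one subject measured twice, with a random intercept and a random temperature slope:

```python
# Verify Var(Y) = Z G Z' + R on a 2-observation example.
# All numbers below are made up for illustration.

def matmul(A, B):
    # plain-Python matrix product
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

Z = [[1.0, 10.0],   # row i: (1, temperature_i)
     [1.0, 12.0]]
G = [[4.0, 1.0],    # Var(S0)      Cov(S0, S3)
     [1.0, 0.25]]   # Cov(S0, S3)  Var(S3)
sigma2 = 2.0
R = [[sigma2, 0.0],
     [0.0, sigma2]]

ZGZt = matmul(matmul(Z, G), transpose(Z))
VarY = [[ZGZt[i][j] + R[i][j] for j in range(2)] for i in range(2)]
```

The off-diagonal entries of `VarY` show the within-subject covariance induced purely by the random effects.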
Generally, we are not interested in the random effects; instead we want to estimate the fixed effects $\beta$. The purpose of including random effects in the model is to make the model better reflect the real situation when correlation exists among the responses. If $Z$ has many columns with a complicated structure, it is difficult to figure out what $ZGZ'$ looks like, which means you do not really know what model you are fitting. Theoretically, you can have many continuous variables in $Z$, but in practice the model is difficult to interpret when $Z$ contains two or more continuous variables.
Another approach is to get rid of the random effects and specify the variance-covariance matrix directly through $R$. When the variance-covariance structure is clear, this approach is better than using random effects.
In your case, if you think that temperature affects the correlation, for example that two measurements from the same subject are more highly correlated when their temperatures are close, you can specify $R$ through the difference of the temperatures, such as $\rho^{|t_i - t_j|}$.
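Building such an $R$ is straightforward; a minimal sketch, with an assumed $\rho$, error variance, and temperature values:

```python
# Within-subject correlation that decays with temperature difference:
# R[i][j] = sigma2 * rho ** abs(t[i] - t[j]).  Values are assumptions.
rho, sigma2 = 0.8, 1.0
temps = [10.0, 11.0, 15.0]   # temperatures at one subject's measurements

R = [[sigma2 * rho ** abs(ti - tj) for tj in temps] for ti in temps]
# diagonal entries equal sigma2; correlation shrinks as the
# temperature gap grows (0.8 for a 1-degree gap, 0.8^5 for 5 degrees)
```

This is the same idea as a continuous-time AR(1) structure, with temperature distance playing the role of time distance.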
Using Jake Westfall's EMS function documented HERE, I estimated the variance components from the mixed model. The random variances above that were estimated to be 0 are indeed negative as derived by EMS.
# divide each term by its coefficient from the EMS table to recover the variance components
cbind(rev(ans[c(grep("e", names(ans)), grep("s", names(ans)))]) /
      c(1, 2, 2, 2, 4, 4, 4, 5, 1))
s 69.838600
a:s 7.711213
b:s -1.423696
c:s -1.098723
a:b:s -1.498620
a:c:s -1.212590
b:c:s -1.244919
a:b:c:s -2.650683
e 922.447967
When running lmer.4 on the aggregated data set, lmer gives identical results (except for the 3-way interaction, but even that is still very similar).
> lmer.4 <- lmer(rating ~ timepoint_n*att_cond_n*gng_cond_n + (1|subjectID) + (0 + timepoint_n | subjectID) +
    (0 + att_cond_n | subjectID) + (0 + gng_cond_n | subjectID) + (0 + timepoint_n:att_cond_n | subjectID) +
    (0 + timepoint_n:gng_cond_n | subjectID) + (0 + att_cond_n:gng_cond_n | subjectID),
    data = d_aggr, control = lmerControl(optCtrl = list(maxfun = 1e9)))
> summary(lmer.4)
Linear mixed model fit by REML
t-tests use Satterthwaite approximations to degrees of freedom ['lmerMod']
Formula: rating ~ timepoint_n * att_cond_n * gng_cond_n + (1 | subjectID) +
(0 + timepoint_n | subjectID) + (0 + att_cond_n | subjectID) +
(0 + gng_cond_n | subjectID) + (0 + timepoint_n:att_cond_n |
subjectID) + (0 + timepoint_n:gng_cond_n | subjectID) + (0 + att_cond_n:gng_cond_n | subjectID)
Data: d_aggr
Control: lmerControl(optCtrl = list(maxfun = 1e+09))
REML criterion at convergence: 2438.7
Scaled residuals:
Min 1Q Median 3Q Max
-4.0726 -0.1788 -0.0138 0.2457 4.3407
Random effects:
Groups Name Variance Std.Dev.
subjectID (Intercept) 2.850e+02 1.688e+01
subjectID.1 timepoint_n 3.651e+01 6.042e+00
subjectID.2 att_cond_n 2.257e-11 4.750e-06
subjectID.3 gng_cond_n 1.269e+00 1.126e+00
subjectID.4 timepoint_n:att_cond_n 0.000e+00 0.000e+00
subjectID.5 timepoint_n:gng_cond_n 8.132e-01 9.018e-01
subjectID.6 att_cond_n:gng_cond_n 6.839e-01 8.270e-01
Residual 4.694e+01 6.851e+00
Number of obs: 328, groups: subjectID, 41
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 140.88293 2.66360 40.00000 52.892 < 2e-16 ***
timepoint_n -6.92561 1.01664 40.00000 -6.812 3.43e-08 ***
att_cond_n -0.13354 0.37828 120.00000 -0.353 0.72470
gng_cond_n 1.35854 0.41718 40.00000 3.256 0.00230 **
timepoint_n:att_cond_n -0.02744 0.37828 120.00000 -0.073 0.94230
timepoint_n:gng_cond_n 1.36220 0.40365 40.00000 3.375 0.00165 **
att_cond_n:gng_cond_n 1.00549 0.39972 40.00000 2.515 0.01601 *
timepoint_n:att_cond_n:gng_cond_n 0.98963 0.37828 120.00000 2.616 0.01004 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Correlation of Fixed Effects:
(Intr) tmpnt_ att_c_ gng_c_ tmpnt_n:t__ tmpnt_n:g__ a__:__
timepoint_n 0.000
att_cond_n 0.000 0.000
gng_cond_n 0.000 0.000 0.000
tmpnt_n:t__ 0.000 0.000 0.000 0.000
tmpnt_n:g__ 0.000 0.000 0.000 0.000 0.000
att_cnd_:__ 0.000 0.000 0.000 0.000 0.000 0.000
tmpn_:__:__ 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Therefore, it seems that because lmer cannot (is not allowed to) estimate negative variance components, it cannot reproduce the fit of aov, and the results differ; the difference disappears when running the lmer representation of the model on the aggregated data, which eliminates the negatively estimated variance components.
Best Answer
At first sight, there are two main differences:
(year|cluster) implicitly includes the intercept and thus expands to (1 + year|cluster). In this case you assume a different baseline for each cluster and allow the clusters to vary with respect to the year effect.

Note that mod1 estimates $k$ variances and $k(k-1)/2$ correlations for the $k$ random effects per cluster. If you constrain the cluster-related variances to be equal and all the correlations also to be equal (this is called compound symmetry), you get a model with many fewer (only two) variance/covariance parameters:
(note the similarity/difference compared to mod2) which can be useful when there is not enough information in the data to estimate mod1. For more on this see this excellent post by Reinhold Kliegl and also my follow-up question: Equivalence of (0 + factor|group) and (1|group) + (1|group:factor) random effect specifications in case of compound symmetry.
So mod3 can be seen as a restricted version of mod1. Your mod2 is restricted even further, but this one-parameter parametrization is arguably less realistic.
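The parameter counts behind this comparison can be made concrete. An unstructured covariance for $k$ correlated random effects per cluster (as in mod1) needs $k$ variances plus $k(k-1)/2$ correlations, while the compound-symmetry version (as in mod3) needs only two parameters regardless of $k$; a tiny sketch:

```python
# Variance/covariance parameters for k correlated random effects per
# cluster under an unstructured covariance (the mod1 situation).
def unstructured_params(k):
    return k + k * (k - 1) // 2

# Compound symmetry (the mod3 situation): one common variance and one
# common covariance, no matter how many random effects there are.
COMPOUND_SYMMETRY_PARAMS = 2

# e.g. a random intercept plus a random year slope: k = 2
print(unstructured_params(2))  # 3: two variances and one correlation
```

The gap grows quadratically in $k$, which is why the unstructured model can become hard to estimate with modest data.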