Solved – Accounting for both within subjects and between subjects mixed model

lme4-nlme, mixed model, repeated measures

I have an experiment with several subjects, and I am analyzing a response measured on each of them (call this RESPONSE). I am interested in the overall effect of temperature on RESPONSE. RESPONSE is measured once daily for each subject over the course of several weeks. Each subject also belongs to one of two levels of a factor (call this FACTOR). I want to know whether the relationship between temperature and RESPONSE differs by FACTOR.

This is longitudinal data measuring a response repeatedly through time on each individual subject. Therefore, I analyze this with a mixed model using lmer in lme4. The model specification looks like this…

Model <- lmer(RESPONSE ~ Temperature + FACTOR + doy + Temperature*FACTOR + FACTOR*doy + (1 + doy | subject), data = dat, REML=TRUE)

In this model, doy is the day of the year to account for the fact that the effect is likely to vary through time due to processes occurring within the subject environment.
[Figure: RESPONSE plotted against temperature for one level of FACTOR, points color coded by subject (four subjects).]
I am interested in the overall effect of temperature on the response. The way this model is set up, I believe it is looking at temperature within each subject only. The image above shows the relationship between temperature and response for one level of FACTOR. You can see that the overall relationship is positive, and a linear regression indicates a highly significant relationship. However, if you look within subjects (graph is color coded by subject, total of four subjects), the relationship is actually slightly negative. It is this slightly negative relationship that the model picks up on, reporting a negative coefficient for this level of FACTOR. I do understand these are not independent observations, so a linear regression is not appropriate. However, it still seems like this should be an overall positive relationship. Is there any way to specify the model so that it accounts for both the within subjects effect of temperature and the between subjects effect of temperature?

Best Answer

Your original model:

$Y_{si} = \beta_0 + S_{0s} + (\beta_1 + S_{1s})X_{1si} + \beta_2 X_{2si} + \beta_3 X_{3si} + \beta_4 X_{1si}X_{2si} + \beta_5 X_{2si}X_{3si} + \epsilon_{si}$, where $s = 1, \ldots, S$ indexes the subject, $i = 1, \ldots, I_s$ indexes the measurement, $X_{1si}$ is day of year, $X_{2si}$ is factor, $X_{3si}$ is temperature, $\epsilon_{si} \sim N(0, \sigma^2)$, and $(S_{0s}, S_{1s})' \sim N\left((0,0)', \begin{pmatrix}\sigma_1^2 & \sigma_{12}\\ \sigma_{12} & \sigma_2^2\end{pmatrix}\right)$. $\beta_0, \ldots, \beta_5$ are fixed effects.

Is $X_{1si}$ coded as 1 for Jan 1 and 365 (or 366) for Dec 31 of a given year? If so, you may need a periodic function of it, or you may need to drop it: under that coding the model implies that the means of $Y_{si}$ on Dec 31, 2015 and Jan 1, 2016, two consecutive days, differ by about $364\beta_1$ (day 365 versus day 1, other covariates held fixed), which may not be realistic.
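The periodic-function idea can be sketched by replacing day of year with a sine and cosine pair. This is only an illustration: the helper name `doy_harmonics` is hypothetical, and the single annual cycle with a period of 365.25 days is an assumption, not something the question specifies.

```r
# Hypothetical helper: map day of year to one annual sine/cosine pair.
# The period of 365.25 days is an illustrative assumption.
doy_harmonics <- function(doy, period = 365.25) {
  angle <- 2 * pi * doy / period
  cbind(doy_sin = sin(angle), doy_cos = cos(angle))
}

# Dec 31 (day 365) and Jan 1 (day 1) are far apart on the raw scale
# but close together on the harmonic scale.
h <- doy_harmonics(c(365, 1))
raw_gap <- abs(365 - 1)
harmonic_gap <- sqrt(sum((h[1, ] - h[2, ])^2))
```

The two columns could then stand in for `doy` in the fixed part of the model. With only several weeks of data per subject, a plain linear `doy` term may already be adequate.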

I think your random slope should be on $X_{3si}$ (temperature) instead of on $X_{1si}$ (day of year). You could fit a model like this: $Y_{si} = \beta_0 + S_{0s} + \beta_1 X_{1si} + \beta_2 X_{2si} + (\beta_3 + S_{3s})X_{3si} + \beta_4 X_{1si}X_{2si} + \beta_5 X_{2si}X_{3si} + \epsilon_{si}$
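A minimal sketch of the model with the random slope on temperature, written with lmer from lme4. The data frame here is simulated as a stand-in for `dat`: the number of subjects, the effect sizes, and the name `Model2` are all made up for illustration.

```r
library(lme4)

# Simulated stand-in for `dat`; all numbers below are made up.
set.seed(1)
n_subj <- 8
n_days <- 30
dat <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_days)),
  doy     = rep(150 + seq_len(n_days), times = n_subj)
)
dat$FACTOR <- factor(ifelse(as.integer(dat$subject) <= n_subj / 2, "A", "B"))
dat$Temperature <- 15 + 0.1 * (dat$doy - 150) + rnorm(nrow(dat), sd = 3)

# Subject-specific intercepts and temperature slopes.
b0 <- rnorm(n_subj, sd = 2)[dat$subject]
b3 <- rnorm(n_subj, sd = 0.3)[dat$subject]
dat$RESPONSE <- 10 + b0 + (0.5 + b3) * dat$Temperature +
  0.02 * dat$doy + rnorm(nrow(dat), sd = 1)

# Random intercept and random temperature slope for each subject.
Model2 <- lmer(
  RESPONSE ~ Temperature * FACTOR + FACTOR * doy + (1 + Temperature | subject),
  data = dat, REML = TRUE
)
summary(Model2)
```

With few subjects, lmer may report a singular fit or a convergence warning for the slope variance. That is worth checking before interpreting the fixed effects.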

Obviously, this is an exploratory analysis, and you need to find a model that fits the data. In my experience, it helps to first fit several fixed-effects models (ordinary linear models) with temperature alone, then with the other covariates, and then with the interactions. If none of these models shows what you expect, maybe your theory is incorrect. If one does, try adding the random effects to it, so that the final model is more realistic.

In matrix form, the mixed model is

$Y = X\beta + Z\gamma + \epsilon$, where $\gamma \sim N(0, G)$ and $\epsilon \sim N(0, R)$. For a given $X$, the variance-covariance matrix of $Y$ is

$Var(Y) = ZGZ'+R$
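To make the formula concrete, here is a toy calculation for the simplest case: a random intercept only, with two subjects and two measurements each. The variance values are made up for illustration.

```r
# Toy layout: 2 subjects, 2 measurements each, random intercept only.
# Variance values are made up for illustration.
sigma2_subject <- 4   # variance of the random intercept (enters G)
sigma2_resid   <- 1   # residual variance (enters R)

# Z maps each of the 4 observations to its subject's intercept.
Z <- kronecker(diag(2), matrix(1, nrow = 2, ncol = 1))
G <- sigma2_subject * diag(2)
R <- sigma2_resid * diag(4)

V <- Z %*% G %*% t(Z) + R
V
```

Here `V` is block diagonal: two measurements on the same subject share the covariance `sigma2_subject`, and measurements on different subjects are uncorrelated. Once `Z` also holds a continuous covariate such as temperature, the entries of `V` depend on the covariate values and are much harder to read off.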

Generally, we are not interested in the random effects themselves; we want to estimate the fixed effects $\beta$. The purpose of including random effects is to make the model better match the real situation when the responses are correlated. If $Z$ has many columns with a complicated structure, it is difficult to work out what $ZGZ'$ looks like, which means you do not really know what model you are fitting. Theoretically, you can have many continuous variables in $Z$, but in practice the model becomes difficult to interpret once $Z$ contains two or more continuous variables.

Another approach is to drop the random effects and specify the variance-covariance matrix directly through $R$. When the variance-covariance structure is clear, this approach is better than using random effects.

In your case, if you think that temperature affects the correlation, for example that two measurements from the same subject are more highly correlated when their temperatures are close, you can specify $R$ through the temperature difference, such as $\rho^{|t_i-t_j|}$.
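As a rough sketch, the block of $R$ for one subject under this $\rho^{|t_i-t_j|}$ structure can be built directly. The temperatures, `rho`, and `sigma2` below are made-up values, not estimates from the data.

```r
# Temperatures for one subject's measurements (made-up values).
temps  <- c(12.0, 12.5, 15.0, 20.0)
rho    <- 0.8   # assumed correlation at a 1-degree temperature difference
sigma2 <- 1     # assumed residual variance

# Correlation decays with the absolute temperature difference.
temp_dist <- abs(outer(temps, temps, "-"))
R_subject <- sigma2 * rho^temp_dist
R_subject
```

To estimate such a structure rather than fix it by hand, nlme's `corCAR1(form = ~ Temperature | subject)` has the same form for a continuous covariate. It could be passed as the `correlation` argument of `gls()` or `lme()`, although it expects distinct covariate values within each subject, and whether it suits these data is untested here.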
