I have repeated measurements of individuals, like this
y | id | time | V1 | V2
100 1 1 23.2 0.8
88 1 2 22.6 0.9
98 2 1 10.6 1.1
83 2 2 11.1 1.3
y
is a continous outcome variable, id
is the patient id, time
is the time point of oberservation and V1
and V2
are covariates.
In this data example we have 2 patients that have 2 observations each, one at time 1 and another one at time 2.
My real data sets has hundreds of patients (ids) and 2-5 observations (time) for each patient.
I now know that the effect of V1
on y
is non-linear and what to model this with a GAM.
In the mgcv
package there is a function called gamm()
. In my example, I use it like this:
m <- gamm(y ~ s(V1) + V2 + time ,family=gaussian,
data=dat,random= ~(time|id))
Is this correct?
Does this model integrate the fact that V1
can change over time for each individual?
Best Answer
What @Roland is getting at is to use a random spline basis for the
time
byid
random part. So your model would become:This model says that the effect of
time
is smooth and varies byid
, with a separate smooth being estimated for eachid
but each smooth is assumed to have the same wiggliness (a single smoothness penalty is estimated for all thetime
smoothers) but can differ in shape.To estimate a separate "global" effect the model could be
If you want similar models but where each smooth can have different wiggliness as well as shape, then the
by
smoothers can be used:The
m=1
means that the smoother uses a penalty on the squared first derivative, which penalises departure from a flat function of no effect. As this is on the subject specific smooths, the model is penalising deviations from the "global" smooth.Some colleagues and I have described these models in some detail in a paper submitted to PeerJ, which is available as a preprint. A new version in response to reviewers comments should be up in a few days (we've submitted it to the journal).