I would like some advice on what statistical analysis to use to test my hypothesis. I am using R. My setup is as follows (in .csv format, first line is header):
id,time,group,outcome
1,t1,control,3.4
1,t2,control,3.2
2,t1,treatment,3.4
2,t2,treatment,4.2
3,t1,control,3.3
3,t2,control,3.1
4,t1,treatment,3.2
4,t2,treatment,4.5
The hypothesis I want to test is whether a treatment (say, an exercise E) has increased the 'outcome' variable significantly. Normally I would just use a paired t-test on the 'treatment' group to see if the increase is significant, since I took repeated measures from the same subjects (i.e. pre- and post-test). However, I have found that there is a 'practice effect' in the way I measure the outcome, which causes the 'outcome' variable to decrease between measurements even without the exercise E. I would therefore like to account for this practice effect by having a separate 'control' group. Basically, I want to find out whether the 'outcome' variable has increased significantly between t1 and t2 in the 'treatment' group, taking into account the practice effect measured in the 'control' group.
Conceptually (sorry, I may not be using the correct technical terms here), say the 'outcome' variable increased from M1=3.2 to M2=3.4 in the treatment group and decreased from M1=3.2 to M2=3.1 in the control group. I want to account for this by adding the 0.1 decrease to the 0.2 increase of the treatment group, and test whether the result is significant.
I have been researching this myself and read somewhere that I could use a factorial ANOVA, but I am not sure how to find the p-value for what I want to test. I have also read that I can use a linear mixed model (lme) with id as a random effect, but I am totally confused by this as well. I am a newbie in R and a semi-newbie in statistics.
Any help or advice would be very much appreciated! Thank you very much.
Best Answer
I assume that t1 can be encoded as time zero, i.e., there is no treatment or training effect at this time point (these are "baseline measurements").
Now we fit a mixed effects model.
Judging from the plot above you should also include random slopes, but you don't have sufficient data for that. I hope in reality you have more data (many more individuals).
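A minimal sketch of these steps, assuming lme() from the nlme package (the function named in the question) and a model with fixed effects for time, group and their interaction plus a random intercept per id. The formula and the object names DF and fit are assumptions for illustration, not necessarily what was originally run:

```r
library(nlme)  # assumption: nlme::lme(); lme4::lmer() would be an alternative

# The example data from the question
DF <- read.csv(text = "id,time,group,outcome
1,t1,control,3.4
1,t2,control,3.2
2,t1,treatment,3.4
2,t2,treatment,4.2
3,t1,control,3.3
3,t2,control,3.1
4,t1,treatment,3.2
4,t2,treatment,4.5")

# Encode t1 as 0 (baseline) and t2 as 1
DF$time  <- ifelse(DF$time == "t1", 0, 1)
DF$group <- factor(DF$group, levels = c("control", "treatment"))
DF$id    <- factor(DF$id)

# Fixed effects: time, group and their interaction (assumed formula);
# random effect: one intercept per subject
fit <- lme(outcome ~ time * group, random = ~ 1 | id, data = DF)

# "time" is the change in the control group (the practice effect);
# "time:grouptreatment" is the additional change in the treatment group
summary(fit)$tTable
```

With control as the reference level, the row for time:grouptreatment is the one that answers the question: it estimates how much more the treatment group changed than the control group. With only four subjects the fit is illustrative at best.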
As you see, the model would tell you that your "practice effect" (slope time1) is not significant but the treatment has a highly significant effect on the slope. I believe this is what you want to know.
(The random intercept variance is very small because there is no random slope although one is needed. This is a problem. However, I'd be confident in the conclusion as it is already obvious when plotting the data.)
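As a cross-check on the treatment effect: with exactly two time points per subject, comparing each subject's change (t2 minus t1) between the groups targets the same quantity, namely the treatment group's change minus the control group's change. A base-R sketch, where wide, change and tt are illustrative names and the equal-variance t-test is an assumption chosen to mirror an ANOVA-style interaction test:

```r
# The example data from the question
DF <- read.csv(text = "id,time,group,outcome
1,t1,control,3.4
1,t2,control,3.2
2,t1,treatment,3.4
2,t2,treatment,4.2
3,t1,control,3.3
3,t2,control,3.1
4,t1,treatment,3.2
4,t2,treatment,4.5")

# One row per subject, with the t1 and t2 outcomes side by side
wide <- merge(DF[DF$time == "t1", ], DF[DF$time == "t2", ],
              by = c("id", "group"), suffixes = c(".t1", ".t2"))

# Per-subject change from baseline
wide$change <- wide$outcome.t2 - wide$outcome.t1
wide$group  <- factor(wide$group, levels = c("control", "treatment"))

# Compare the mean change between groups (equal variances assumed)
tt <- t.test(change ~ group, data = wide, var.equal = TRUE)
tt

# Treatment change minus control change
unname(diff(tt$estimate))
```

The last line is the "add the control decrease to the treatment increase" quantity from the question, computed on the example data.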