It would be a violation of independence to "group the data by conditions and not care that multiple data points come from one subject", so that is a no go. One approach is "to take the mean of all measurements for each condition from each subject and then compare the means". You could do it that way without violating independence, but you lose some information in the aggregation to subject-level means.
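For concreteness, here is a minimal sketch of that aggregation approach in R, assuming conditions are between subjects and a data frame df with columns Subject, Condition, and a response y (all names illustrative):

    # collapse to one mean per subject per condition
    means <- aggregate(y ~ Subject + Condition, data = df, FUN = mean)
    # compare conditions on the subject-level means
    t.test(y ~ Condition, data = means)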
On the face of it, this sounds like a mixed design, with conditions between subjects and multiple time periods measured within subjects. However, that raises the question: why did you collect data at multiple time points? Is the effect of time, or the progression of a variable over time, expected to differ between conditions? If the answer to either question is yes, then given the structure of the data, I would expect that what you are interested in is a mixed ANOVA. The mixed ANOVA will partition the subject variance out of the SSTotal "behind your back", as it were. But whether that partitioning helps your between-subjects test of conditions depends on several other factors.
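If you prefer R to SPSS, a minimal sketch of that mixed ANOVA, assuming long-format data with factors subject, condition (between) and time (within) and a response y (names illustrative):

    long_df$subject <- factor(long_df$subject)  # subject must be a factor for Error()
    long_df$time    <- factor(long_df$time)
    # between-subjects condition, within-subjects time; subject variance is partitioned out
    summary(aov(y ~ condition * time + Error(subject/time), data = long_df))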
Anyway, in SPSS/PASW 18: Analyze -> General Linear Model -> Repeated Measures. You'll have one row for each subject, one column for each time point, and one column for their condition identifier. The condition identifier goes into the "between" section, and the repeated measures are taken care of when you define the repeated-measures factor.
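If you ever want to take that same wide layout (one row per subject, columns t1, t2, t3 plus condition) into R instead, a hedged sketch with base R's reshape (column names illustrative):

    # one row per subject per time point; the condition column is carried along automatically
    long_df <- reshape(wide_df, varying = c("t1", "t2", "t3"), v.names = "y",
                       timevar = "time", idvar = "subject", direction = "long")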
Context of my answer
I self-studied this question yesterday (the part concerning the possibility of using mixed models here). I shamelessly dump my fresh understanding of this approach for 2x2 tables and wait for more advanced peers to correct my imprecisions or misunderstandings. My answer will therefore be lengthy and overly didactic (or at least trying to be didactic) in order to help, but also to expose my own flaws. First of all, I must say that I shared the confusion you stated here:
"I've read about multi-level models, which sound like they are intended to handle this situation when the underlying variables are continuous (e.g., real numbers) and when a linear model is appropriate."
I studied all the examples from the paper "Random-effects modelling of categorical response data", whose title itself contradicts this thought. For our problem of 2x2 tables with repeated measurement, the example in section 3.6 is the germane one. This is for reference only, as my goal is to explain it; I may edit out this section in the future if the context is no longer necessary.
The model
General Idea
The first thing to understand is that the random effect is not modelled very differently from the way it is in a regression on a continuous variable. Indeed, a regression on a categorical variable is nothing other than a linear regression on the logit (or another link function, like the probit) of the probability associated with the different levels of this categorical variable. If $\pi_i$ is the probability of answering yes to question $i$, then $logit(\pi_{i})= FixedEffects_i + RandomEffect_i$. This model is linear, and random effects can be expressed in the classical numerical way, for example $$RandomEffect_i\sim N(0,\sigma)$$ In this problem, the random effect represents the subject-related variation for the same answer.
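To make the link concrete, a tiny R sketch of this relationship (all numbers illustrative; plogis is base R's inverse logit):

    u   <- rnorm(100, mean = 0, sd = 1)  # RandomEffect_i ~ N(0, sigma), with sigma = 1 here
    eta <- 0.5 + u                       # fixed effect + random effect, on the linear scale
    p   <- plogis(eta)                   # inverse logit maps back to probabilities
    yes <- rbinom(100, 1, p)             # simulated binary "yes" answers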
Our case
For our problem, we want to model
$\pi_{ijv}$, the probability that subject $i$ answers "yes" to question $v$ at interview time $j$. The logit of this probability is modelled as a combination of fixed effects and subject-related random effects:
$$logit(\pi_{ijv})=\beta_{jv}+u_{iv}$$
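A minimal simulation of exactly this structure, to pin down the indices (two time points, two questions; all parameter values illustrative):

    set.seed(1)
    n_subj <- 30
    beta <- matrix(c(0.2, -0.4, 0.5, 0.1), nrow = 2)        # beta_jv: row = time j, column = question v
    u    <- matrix(rnorm(n_subj * 2, 0, 1), nrow = n_subj)  # u_iv ~ N(0, 1): row = subject i, column = v
    d <- expand.grid(i = 1:n_subj, j = 1:2, v = 1:2)        # one row per answer
    d$yes <- rbinom(nrow(d), 1, plogis(beta[cbind(d$j, d$v)] + u[cbind(d$i, d$v)]))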
About the fixed effects
The fixed effects are then related to the probability of answering "yes" at time $j$ to question $v$. Depending on your scientific goal, you can use a likelihood ratio test to decide whether the equality of certain fixed effects must be rejected. For example, the model where $\beta_{1v}=\beta_{2v}=\beta_{3v}=...$ means that there is no tendency for the answer to change from one interview time to the next. If you assume that this global tendency does not exist, which seems to be the case for your study, you can drop the $j$ straight away: $\beta_{jv}$ becomes $\beta_{v}$. By analogy, you can test with a likelihood ratio whether the equality $\beta_{1}=\beta_{2}$ must be rejected.
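As a hedged sketch of that last test with lme4's glmer (assuming a data frame df with a binary response yes, a Question factor, and a Subject identifier), fit the model with and without the question effect and compare:

    library(lme4)
    # full model: a separate beta for each question
    full <- glmer(yes ~ Question + (1 | Subject), data = df, family = binomial)
    # restricted model: beta_1 = beta_2 (a single intercept)
    restricted <- glmer(yes ~ 1 + (1 | Subject), data = df, family = binomial)
    anova(restricted, full)  # likelihood ratio test of the equality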
About random effects
I know it's possible to model random effects with something other than normal errors, but I prefer to answer on the basis of normal random effects for the sake of simplicity.
The random effects can be modelled in different ways. With the notation $u_{iv}$, I assumed that a random effect is drawn from its distribution each time a subject answers a question. This is the most specific degree of variation possible. If I had used $u_{i}$ instead, it would have meant that one random effect is drawn for each subject $i$ and is the same for each question $v$ he has to answer (some subjects would then have a tendency to answer yes more often). You have to make a choice. If I understood well, you can also have both random effects: $u_{i}\sim N(0,\sigma_1)$, which is subject-drawn, and $u_{iv}\sim N(0,\sigma_2)$, which is subject-and-answer-drawn. I think your choice depends on the details of your case. But if I understood well, the risk of overfitting by adding random effects is not big, so when in doubt, one can include several levels.
A proposition
I realize how weird my answer is; it's an embarrassing rambling, certainly more helpful to me than to others. Maybe I'll edit out 90% of it. I am not more confident, but I am more disposed to get to the point.
I would suggest comparing the model with nested random effects ($u_{i}+u_{iv}$) against the model with only the combined random effect ($u_{iv}$). The idea is that the $u_i$ term is solely responsible for the dependence between answers; rejecting independence is rejecting the presence of $u_{i}$. Using glmer to test this would give something like:
    library(lme4)
    # nested random effects u_i + u_iv: (1 | Subject/Question) expands to both terms
    model1 <- glmer(yes ~ Question + (1 | Subject/Question), data = df, family = binomial)
    # combined random effect u_iv only
    model2 <- glmer(yes ~ Question + (1 | Subject:Question), data = df, family = binomial)
    anova(model1, model2)  # likelihood ratio test between the two models
Question is a dummy variable indicating whether question 1 or question 2 is asked.
If I understood well, (1 | Subject/Question )
is related to the nested structure $u_{i}+u_{iv}$ and (1 |Subject:Question)
is just the combination $u_{iv}$. anova
computes a likelihood ratio test between the two models.
Best Answer
Yes, more data is usually better as a general rule, and given that you currently have just one answer per condition, the move to two data points per condition for each participant is a good idea. It's not just one more data point; it doubles the amount of data on which to model a response per person.
You could take the average of each participant's answers; a strict answer to whether that is acceptable depends on the variability between the answers. But why bother? Just add another factor, 'question number', with two levels, '1' and '2', for each participant, as in the sketch below. If there is no difference between the answer orders, the whole model will come out the same as taking the mean. On the other hand, if there is a systematic difference between the answers (for some reason), you can find out about that as well, essentially for free.
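A hedged sketch of that model, keeping the binary response and lme4 approach used above (Condition and QuestionNumber are illustrative column names):

    library(lme4)
    # QuestionNumber as a two-level factor; the interaction picks up any order effect
    m <- glmer(yes ~ Condition * QuestionNumber + (1 | Subject),
               data = df, family = binomial)
    summary(m)  # if the QuestionNumber terms are negligible, this behaves like averaging the two answers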
'Need' is a tricky word here, but yes, this could potentially decrease the number of participants you need. Asking more questions increases your N, and having them be within-subject comparisons is even better (usually). So by asking more questions per person you should be reducing your variance, and thus increasing the likelihood of finding a significant result.