Solved – Repeated Measures ANOVA post hoc test (Bayesian)

anova, bayesian

I am trying to understand the procedure for carrying out a Bayesian Repeated Measures ANOVA. In a conventional repeated measures ANOVA, I calculate the effect of a certain parameter (e.g., study condition) on a measured variable (e.g., test score). Then, I follow up my test with a post hoc test if the p value is lower than a certain value. This post hoc test will tell me the effect of each individual study condition. Therefore, I not only know that the study condition had an effect on the measured results, but I am also able to infer which condition had the greatest effect.

For Bayesian Repeated Measures ANOVA, it is not clear to me how to follow up the initial test. I can do a 'model comparison', which gives me a Bayes factor to indicate the evidence for that specific model. Let's say study condition is found to have a high BF – now I am of course interested in which study condition performed better or worse than the others. How do I approach this question?

Additional information about the study: This concerns a within-subjects, full repeated measures design. The study consisted of a total of three conditions, with participants completing the conditions in different orders (i.e., a balanced Latin square).

(I used JASP software for my analysis; while the regular repeated measures ANOVA offers post hoc testing, this option is not available for the Bayesian repeated measures ANOVA.)

Best Answer

First, why do you want to perform a Bayesian analysis? If you want to know which of your conditions is "significantly different" from the others, the repeated-measures ANOVA with planned contrasts gives you the answer. The strength of the Bayesian approach is not in this kind of null-hypothesis significance testing (NHST), but rather in parameter estimation. The basic idea is that under the Bayesian approach, you are never boiling down your model to binary decisions about accepting/rejecting models, nor assuming "point estimates" for your parameters, allowing you to evaluate the performance of your model in more detail. For more information, see Kruschke's excellent book or this paper, in particular the section entitled "Bayesian estimation generally". Previous questions on Cross Validated are well worth a read (here and here).

Now onto your specific problem. Let $n$ represent the number of subjects, $m$ represent the number of conditions (in this case, $m=3$), and $Y$ represent a stacked vector of your data, organized like this:

$$ Y = \begin{pmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n\cdot{}m} \end{pmatrix}, $$

where the $y_1$ to $y_m$ are the values from the first subject, $y_{m+1}$ to $y_{2m}$ are the values from the second subject, and so on. We assume that the data are normally distributed,

$$Y\sim\mathcal{N}\left(\vec{\mu},\sigma^2\right),$$

where $\vec{\mu}$ is a stacked vector of means and $\sigma^2$ is the variance. The mean vector is given by

$$\vec{\mu}=\textbf{X}\vec{\beta},$$

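Here is a minimal NumPy sketch of building $Y$ and $\textbf{X}$ as described above. The function name `build_design` and the example scores are purely illustrative, not part of the original answer; it assumes the data are ordered subject-by-subject.

```python
import numpy as np

def build_design(scores, n_subjects, n_conditions):
    """scores: array of shape (n_subjects, n_conditions), one row per subject."""
    Y = scores.reshape(-1)  # stacked response vector, subject-major order
    n_rows = n_subjects * n_conditions
    X = np.zeros((n_rows, n_subjects + n_conditions - 1))
    for i in range(n_subjects):
        # subject-specific "intercept": 1 for that subject's m rows
        X[i * n_conditions:(i + 1) * n_conditions, i] = 1.0
    for j in range(1, n_conditions):
        # condition columns; condition 1 is absorbed by the subject intercepts
        X[j::n_conditions, n_subjects + j - 1] = 1.0
    return Y, X

# Illustrative example with n = 4 subjects and m = 3 conditions (made-up scores):
scores = np.array([[5., 7., 6.], [4., 6., 5.], [6., 8., 7.], [5., 6., 6.]])
Y, X = build_design(scores, n_subjects=4, n_conditions=3)
print(X.shape)  # (12, 6): 4 subject intercepts + 2 condition effects
```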

The coefficient vector $\vec{\beta}$ looks like

$$\vec{\beta} = \begin{pmatrix} \beta_1 & \beta_2 & \cdots & \beta_{n+m-1} \end{pmatrix}^{\mathsf{T}},$$

where each entry is assigned a suitably weak normal prior. Finally, we need to assign a prior to $\sigma$, such as a half-Cauchy.

And that's it! You can construct this model in BUGS, JAGS, Stan, or PyMC, and sample the joint posterior using MCMC. To determine whether there was a difference between, for example, condition 1 and condition 2, look at the marginal posterior distribution of $\beta_{n+1}$. This represents the group-level average difference between these conditions. If the 95% credible interval of this distribution excludes 0, you can conclude that there is probably a meaningful difference between condition 1 and condition 2.
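As one possible concretization, here is a minimal PyMC sketch of this model, assuming the `Y` and `X` built above. The function name `fit_model`, the specific prior scales (Normal(0, 10) for the coefficients, half-Cauchy with scale 5 for $\sigma$), and the sampler settings are illustrative choices, not prescribed by the answer.

```python
import pymc as pm

def fit_model(Y, X):
    """Sample the repeated-measures regression described above."""
    with pm.Model() as model:
        # One coefficient per design-matrix column: n subject "intercepts"
        # followed by (m - 1) condition effects relative to condition 1.
        beta = pm.Normal("beta", mu=0.0, sigma=10.0, shape=X.shape[1])
        # Weakly informative prior on the residual standard deviation.
        sigma = pm.HalfCauchy("sigma", beta=5.0)
        mu = pm.math.dot(X, beta)
        pm.Normal("obs", mu=mu, sigma=sigma, observed=Y)
        idata = pm.sample(2000, tune=1000, chains=4)
    return idata
```

After sampling, the marginal posterior of the coefficient at (zero-based) index `n_subjects` in `beta` is the condition 2 vs. condition 1 contrast ($\beta_{n+1}$ above); its 95% credible interval can be read off with, e.g., `arviz.summary(idata, var_names=["beta"])`.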

PS: I really, really hate the term "Bayesian ANOVA". I know it crops up all the time and there is nothing wrong with it technically, but I find that for the uninitiated, it implies too much equivalence between the frequentist and Bayesian approaches.
