Correlation – What is the Bayesian Counterpart to a Two-Sample t-Test with Unequal Variances?

bayesian, correlation, heteroscedasticity, t-test

I am looking for the Bayesian counterpart of the two-sample t-test with unequal variances (the Welch test). I am also looking for a multivariate counterpart, analogous to Hotelling's $T^2$ statistic. References are appreciated.

For the multivariate case, suppose that we have $(y_1,\cdots,y_N)$ and $(z_1,\cdots,z_N)$, where each $y_i$ (resp. $z_i$) is shorthand for a sample mean, a sample standard deviation, and a number of points. We can assume that the number of points is constant across the whole dataset, that the standard deviation is the same for all $y_i$ (resp. $z_i$), and that the sample means of the $y_i$ (resp. $z_i$) are correlated. If you plot the sample means, they follow each other, and connecting them gives a smoothly varying function.
Now, on some parts the $y$ function agrees with the $z$ function, but on others it does not, because $\frac{mean(y_i)-mean(z_i)}{std(y_i)+std(z_i)}$ becomes large. I would like to quantify this statement.

Best Answer

While you can do this in a Bayesian way, have you considered whether it would actually be better to estimate the difference in the means rather than test whether they are different? This is what Andrew Gelman frequently recommends. I can imagine some possible reasons for wanting to do hypothesis testing, but I don't think they're that common.

I don't think you need something like a t-test: you said the groups have very similar standard deviations, so you can estimate the standard deviation well and treat it as effectively known.

If that's the case, then I think this link should be what you need. It shows how to estimate a difference in means or do a hypothesis test (though I don't recommend the latter). You could also take a look at the part they reference in Bolstad's book (you can find electronic copies online). It's possible to incorporate estimating the variances as well, but that is more complex, so I suspect you're better off incorporating the prior information you have about the variances in a naive way (for example, using the unbiased standard deviation estimator on each of the sets, averaging the estimates within each set, and pretending those averages are your 'known' standard deviations).
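The naive approach described above can be sketched as follows. This is only an illustration under assumptions that are not in the original answer: a normal model for the data, flat (improper) priors on the means, and each index $i$ treated independently, so the correlation between neighboring sample means is ignored. The function name `posterior_differences` and its arguments are hypothetical. With those assumptions, the posterior of $\mu_{y,i}-\mu_{z,i}$ is normal with mean $\bar y_i-\bar z_i$ and variance $(\sigma_y^2+\sigma_z^2)/n$.

```python
import math
from statistics import NormalDist, fmean


def posterior_differences(means_y, sds_y, means_z, sds_z, n, cred=0.95):
    """Posterior of mu_y[i] - mu_z[i] for each i, with 'known' standard deviations.

    Assumptions (not from the original answer): normal data, flat priors on
    the means, and no modelling of the correlation between neighboring points.
    """
    # Naive plug-in: average the per-point sd estimates within each set
    # and pretend the averages are the known standard deviations.
    sigma_y = fmean(sds_y)
    sigma_z = fmean(sds_z)

    # With flat priors, each mean has posterior N(sample mean, sigma^2 / n),
    # so the difference has variance (sigma_y^2 + sigma_z^2) / n.
    post_sd = math.sqrt((sigma_y ** 2 + sigma_z ** 2) / n)
    tail = (1.0 - cred) / 2.0

    results = []
    for my, mz in zip(means_y, means_z):
        posterior = NormalDist(my - mz, post_sd)
        results.append({
            "diff": my - mz,                              # posterior mean
            "sd": post_sd,                                # posterior sd
            "interval": (posterior.inv_cdf(tail),         # credible interval
                         posterior.inv_cdf(1.0 - tail)),
            "prob_positive": 1.0 - posterior.cdf(0.0),    # P(mu_y > mu_z | data)
        })
    return results


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    out = posterior_differences([1.0, 2.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0], n=50)
    for i, r in enumerate(out):
        print(i, r["diff"], r["interval"], r["prob_positive"])
```

The credible interval for each difference is the estimation-style summary recommended above; indices where the interval lies far from zero are the regions where the two curves disagree. Because the standard deviations are only plugged in, the intervals will be somewhat too narrow when $n$ is small.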
