If I want to compare two sets of measurements, i.e., how much their means differ or how much the sets differ, I would say I have two options:
-
Paired difference t-test: Calculate the differences to get the change score for each element and then use the set of differences to build the t statistic (with the mean and standard deviation of the differences) and compare with the t-distribution. The degrees of freedom would be n-1.
-
Similar approach, but calculating means and standard deviation separately for both samples. Build the the statistic using the two means and standard deviations. The degrees of freedom would be 2n-2.
What are the differences between these two approaches?
Best Answer
To readers : please note the hierarchy of the answer :-)
Suppose $X\sim N(\mu_x,\sigma_x^2)$ and $Y\sim N(\mu_y,\sigma_y^2)$. For simplicity, suppose $\sigma_x^2=\sigma_y^2=\sigma^2$, which is unknown. Suppose the two samples are $\mathbb{X}=\{X_1,\dots,X_m\}$ and $\mathbb{Y}=\{Y_1,\dots,Y_n\}$. We are testing $H_0: \mu_x-\mu_y=0$
when $m=n$
if $\mathbb{X}$ and $\mathbb{Y}$ are independent, then
Which test should we choose?
A simulation study is conducted to compare the two test procedures in term of statistical power. Let $X\sim N(0, 1)$, $y\sim N(\mu_y, 1)$, let $n=10$ and let $\alpha = 0.05$. The x-axis represents the true value of $\mu_y$ ($H_0: \mu_x = \mu_y$ is true when $\mu_y = 0$), and the y-axis shows the percentages of rejecting $H_0$ using the 2 test procedures. As you can see, when $H_0$ is true ($\mu_y=0$), the two procedures have almost the same type I error; when $H_0$ is false, 2-sample t-test has more chance of rejecting $H_0$.