[Math] Testing whether the two population means are the same using sampling distribution of difference between two means

probability-distributions sampling

The problem is

[Image: table of the sample data for the two mines, not reproduced here]

Given the above data, can we conclude that the two population means are equal?

And my question is, how can I solve this question using the sampling distribution of the difference between two means?

I found the variance for the difference of two means:

variance = $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
= (variance of sample 1) + (variance of sample 2)
= 125.5 + 104.5 = 230

And since the test is whether or not their means are the same, the assumption would be that the population means are the same. So the mean for the sampling distribution would be $\mu_1-\mu_2=0$.

But from here I got confused, because the data provide the 'averages' (which I think are from the samples), and I'm not sure what to do with the sample means. Also, I was planning to find $P\big((\text{sample mean 1} - \text{sample mean 2}) > \text{variance}\big)$ and, if this probability is large, conclude that the population means are not the same. But then there's the problem that the variances aren't homogeneous. How can I proceed from here? (Using only the methods of the sampling distribution, not hypothesis testing!)

Best Answer

Let $\mu_1$, $\mu_2$ be the population means (i.e. the true mean heat-producing capacity) of Mines $1$ and $2$, respectively. Let $\bar x_1 = 8230$, $\bar x_2 = 7940$ be the observed sample means from samples of sizes $n_1 = 5$ and $n_2 = 6$, respectively, from Mines $1$ and $2$. Finally, let $s_1 = 125.5$ and $s_2 = 104.5$ be the observed sample standard deviations of the heat-producing capacity.

The hypothesis to be tested is $$H_0 : \mu_1 = \mu_2 \quad \text{vs.} \quad H_a : \mu_1 \ne \mu_2,$$ and the test statistic we will employ is Welch's $t$ statistic $$T = \frac{\bar x_1 - \bar x_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}},$$ which under $H_0$ approximately follows a Student's $t$ distribution with $\nu$ degrees of freedom, where $$\nu \approx \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2(n_1 - 1)} + \frac{s_2^4}{n_2^2(n_2 - 1)}}$$ is the Satterthwaite approximation for the degrees of freedom. The critical value for this test is $t_{\nu, \alpha/2}^*$, the upper $\alpha/2$ quantile of the Student's $t$ distribution with $\nu$ degrees of freedom. If $|T| > t_{\nu, \alpha/2}^*$, then we reject $H_0$ at significance level $\alpha$ (equivalently, at the $100(1-\alpha)\%$ confidence level) and conclude that the true means are unequal. We may also compute a $p$-value for the test; I obtained $$p \approx 0.00350541.$$
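As a rough check of the numbers, here is a minimal Python sketch that plugs the summary statistics into the formulas for $T$ and $\nu$ and then approximates the two-sided $p$-value. The function names (`welch_t`, `t_pdf`, `two_sided_p`) are my own, and integrating the Student's $t$ density with Simpson's rule is just one convenient way to get the tail area using only the standard library; it is not necessarily how the value above was obtained.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return Welch's t statistic and the Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, nu

def t_pdf(x, nu):
    """Density of the Student's t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def two_sided_p(t, nu, steps=20000):
    """Two-sided p-value: 1 minus the area under the density on [-|t|, |t|].

    The area on [0, |t|] is approximated with Simpson's rule (steps must be
    even) and doubled, using the symmetry of the density.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_area = 2 * total * h / 3
    return 1 - central_area

# Summary statistics from the problem: mean, standard deviation, sample size.
t_stat, nu = welch_t(8230, 125.5, 5, 7940, 104.5, 6)
p_value = two_sided_p(t_stat, nu)
print(f"T = {t_stat:.4f}, nu = {nu:.4f}, p = {p_value:.6f}")
```

The printed $p$-value should land close to the figure quoted above, with $T$ a little above $4$ and $\nu$ a little below $8$.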

The justification for using the Welch test statistic is that the sample variances are not similar in magnitude, so pooling them is not appropriate. In this case the resulting $p$-value is larger than the one obtained from the usual pooled-variance two-independent-sample $t$-test.
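To see the comparison concretely, the following sketch computes both the Welch and the pooled-variance statistics from the same summary data and compares their two-sided $p$-values. The helper names are my own, and the tail areas come from a simple Simpson's-rule integration of the $t$ density, so treat the output as an approximation.

```python
import math

def t_pdf(x, nu):
    """Density of the Student's t distribution with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def two_sided_p(t, nu, steps=20000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 1 - 2 * total * h / 3

# Summary statistics from the problem.
m1, s1, n1 = 8230, 125.5, 5
m2, s2, n2 = 7940, 104.5, 6

# Welch: separate variance estimates, Satterthwaite degrees of freedom.
v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
t_welch = (m1 - m2) / math.sqrt(v1 + v2)
nu_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
p_welch = two_sided_p(t_welch, nu_welch)

# Pooled: one common variance estimate, n1 + n2 - 2 degrees of freedom.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
t_pooled = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
nu_pooled = n1 + n2 - 2
p_pooled = two_sided_p(t_pooled, nu_pooled)

print(f"Welch:  t = {t_welch:.4f}, df = {nu_welch:.3f}, p = {p_welch:.6f}")
print(f"Pooled: t = {t_pooled:.4f}, df = {nu_pooled}, p = {p_pooled:.6f}")
```

With these data the larger sample standard deviation belongs to the smaller sample, which is the situation where the pooled test tends to understate the uncertainty, so the Welch $p$-value comes out larger.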
