Hypothesis Testing – Proof of Student t-test for Independent Samples with Non-zero Mean

Tags: expected-value, hypothesis-testing, mathematical-statistics, normal-distribution, t-distribution

I'm following the proof in Cramér's book Mathematical Methods of Statistics, $\S 29.4$. There it is assumed that we have two independent samples $x_1,\ldots, x_{n_1}$ and $y_1,\ldots,y_{n_2}$ drawn from the same normal population $N(\mu,\sigma^2)$. The author claims that we may assume without loss of generality that $\mu = 0$, but I don't see why. I understand that we may shift the random variables by $\mu$ to center them, but this centering is not usually done when calculating the statistic, and I'm trying to figure out why the statistic follows a t-distribution in the general case. I have tried to imitate the proof without the $\mu = 0$ assumption, and the main lines are as follows:

Consider the quadratic form $$Q = n_1 s_1^2 + n_2 s_2^2 = \sum_1^{n_1} x_i^2 + \sum_1^{n_2} y_i^2 - n_1 \overline x^2 - n_2 \overline y^2$$ (using uncorrected variances for simplicity). Then a transformation partially defined by $z_1 = \sqrt{n_1}\, \overline x$ and $z_2 = \sqrt{n_2}\, \overline y$ and extended to a full orthogonal transformation $(x_1,\ldots,x_{n_1},y_1,\ldots,y_{n_2}) \mapsto (z_1,\ldots,z_{n_1+n_2})$ maps the quadratic form to $$Q = \sum_3^{n_1+n_2} z_i^2,$$ which shows that its rank (the number of degrees of freedom) is $\nu = n_1+n_2-2$. Then define the statistic via the usual formula and do some algebra: $$t = \frac{\overline x - \overline y}{\sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{\nu}} \sqrt{\frac1{n_1} + \frac1{n_2}}} = \frac{\sqrt{\frac{n_2}{n_1+n_2}}\,z_1 - \sqrt{\frac{n_1}{n_1+n_2}}\,z_2}{\sqrt{\frac 1\nu \sum_3^{n_1+n_2} z_i^2}}.$$
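As a sanity check on the claim I'm trying to prove, here is a minimal Monte Carlo sketch (sample sizes, $\mu \neq 0$, and $\sigma$ are arbitrary choices of mine) that computes the statistic above for repeated pairs of samples and compares its empirical mean and variance to those of a $t_\nu$ distribution, namely $0$ and $\nu/(\nu-2)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, mu, sigma = 8, 5, 3.0, 2.0   # arbitrary; note mu != 0
nu = n1 + n2 - 2

def t_stat(x, y):
    # uncorrected (biased) sample variances, as in the question
    s1sq = np.mean((x - x.mean())**2)
    s2sq = np.mean((y - y.mean())**2)
    pooled = np.sqrt((n1 * s1sq + n2 * s2sq) / nu)
    return (x.mean() - y.mean()) / (pooled * np.sqrt(1/n1 + 1/n2))

ts = np.array([t_stat(rng.normal(mu, sigma, n1), rng.normal(mu, sigma, n2))
               for _ in range(100_000)])

# A t_nu variable has mean 0 and variance nu/(nu - 2)
print(ts.mean())                # should be close to 0
print(ts.var(), nu / (nu - 2))  # should be close to each other
```

The empirical moments match the $t_{11}$ moments despite $\mu = 3$, which is consistent with the answer below: the statistic does not depend on $\mu$ at all.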

One can check that the numerator has a centered distribution, $\sqrt{\frac{n_2}{n_1+n_2}}\,z_1 - \sqrt{\frac{n_1}{n_1+n_2}}\,z_2 \sim N(0,\sigma^2)$; however, one would also hope that $z_i \sim N(0,\sigma^2)$ in the denominator, in order to conclude that the statistic follows a t-distribution with $\nu$ degrees of freedom. Arguing via covariance matrices I see why the orthogonal transformation preserves the variance, but I initially failed to see why the expected value of the $z_i$ $(i \geq 3)$ has to be zero. I finally managed to prove it by noting the following observation:

An orthogonal matrix $C$ of size $n_1+n_2$ whose first two rows are $\frac1{\sqrt{n_1}}(1,1,\ldots,1,0,0,\ldots,0)$ ($n_1$ nonzero entries) and $\frac1{\sqrt{n_2}}(0,0,\ldots,0,1,1,\ldots,1)$ ($n_2$ nonzero entries) has the property that the coefficients in every other row sum to zero. Proof: this equivalently says that $C$ maps the vector $(1,1,\ldots,1)$ to $(\sqrt{n_1}, \sqrt{n_2},0,0,\ldots,0)$. But this is true since $C$ must preserve the norm, and $\|(1,1,\ldots,1)\|^2 = n_1+n_2 = \|(\sqrt{n_1}, \sqrt{n_2}, ?,?,\ldots,?)\|^2$ forces all the remaining entries to vanish.
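The observation is easy to verify numerically. Below is a sketch where I complete the two prescribed rows to a full orthogonal matrix via a QR factorization (any completion would do; the random columns and the sign fix-up are implementation details of my construction, not part of the argument):

```python
import numpy as np

n1, n2 = 4, 3
n = n1 + n2

# the two prescribed rows from the observation
r1 = np.concatenate([np.ones(n1) / np.sqrt(n1), np.zeros(n2)])
r2 = np.concatenate([np.zeros(n1), np.ones(n2) / np.sqrt(n2)])

# extend to an orthonormal basis: QR of a matrix whose first two
# columns are r1, r2 and whose remaining columns are random
A = np.column_stack([r1, r2,
                     np.random.default_rng(1).normal(size=(n, n - 2))])
Q, _ = np.linalg.qr(A)
C = Q.T  # rows of C form an orthonormal basis

# QR may flip signs; realign the first two rows with r1, r2
C[0] *= np.sign(C[0] @ r1)
C[1] *= np.sign(C[1] @ r2)

assert np.allclose(C @ C.T, np.eye(n))    # C is orthogonal
assert np.allclose(C[0], r1) and np.allclose(C[1], r2)
assert np.allclose(C[2:].sum(axis=1), 0)  # all other rows sum to zero
```

The last assertion is exactly the claimed property: rows $3,\ldots,n_1+n_2$ are orthogonal to $r_1$ and $r_2$, hence to the all-ones vector $\sqrt{n_1}\,r_1 + \sqrt{n_2}\,r_2$, so their coefficients sum to zero.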

In our situation, since every coordinate of $(x_1,\ldots,x_{n_1},y_1,\ldots,y_{n_2})$ has expectation $\mu$, this forces $E(z_i) = \mu \sum_j C_{ij} = 0$ whenever $i \geq 3$, no matter what $\mu$ is. But this seems a bit convoluted.

In summary, my question is:
Is there a more straightforward approach to understanding why the statistic defined above follows a t-distribution even in the case $\mu \neq 0$, or why the author claims that $\mu = 0$ may be assumed without loss of generality? It seems quite nontrivial to me.

Thanks.

Best Answer

The simple answer is that the statistic does not depend at all on $\mu$, and this is much easier to see from the original, non-transformed formula:

$$t = \frac{\overline x - \overline y}{\sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}\nu} \sqrt{\frac1{n_1} + \frac1{n_2}}}.$$

Indeed, under the transformations $x_i \mapsto x_i - \mu$ and $y_i \mapsto y_i - \mu$, we have $\overline x \mapsto \overline x - \mu$ and $\overline y \mapsto \overline y - \mu$, and also $s_1^2 = \frac1{n_1} \sum (x_i - \overline x)^2 \mapsto s_1^2$ and similarly $s_2^2 \mapsto s_2^2$, so the statistic $t$ is invariant under shifts of the parent distribution. This is why we can assume $\mu = 0$ without loss of generality.
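This shift invariance is also trivial to confirm numerically (a minimal sketch; the sample sizes, shift, and distribution parameters are arbitrary choices):

```python
import numpy as np

def t_stat(x, y):
    n1, n2 = len(x), len(y)
    nu = n1 + n2 - 2
    # uncorrected variances, matching the formula in the question
    s1sq = np.mean((x - x.mean())**2)
    s2sq = np.mean((y - y.mean())**2)
    return (x.mean() - y.mean()) / (
        np.sqrt((n1 * s1sq + n2 * s2sq) / nu) * np.sqrt(1/n1 + 1/n2))

rng = np.random.default_rng(42)
x = rng.normal(5.0, 2.0, 8)
y = rng.normal(5.0, 2.0, 5)

# t is unchanged when both samples are shifted by the same constant
mu = 5.0
assert np.isclose(t_stat(x, y), t_stat(x - mu, y - mu))
```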