Solved – a pivotal statistic

Tags: estimation, hypothesis testing, inference

I'm currently reading "Computer Age Statistical Inference" by Efron and Hastie.

In section 2.1, they discuss some of the mechanisms frequentist inference uses to get around the defect that the properties of an estimator $\hat{\Theta} = t(\mathbf{X})$ must be calculated under an unknown distribution $F$.

One of these mechanisms, they say, is the use of pivotal statistics, which they define as statistics whose distribution does not depend upon the underlying probability distribution $F$. They give the two-sample $t$ statistic as an example of one that does not depend on the underlying distribution.

More specifically, they say that:

  1. given two normally distributed i.i.d. samples $\mathbf{x}_1 = (x_{1,1}, \ldots, x_{1,n_1})$ and $\mathbf{x}_2 = (x_{2,1}, \ldots, x_{2,n_2})$,

  2. given the null hypothesis $H_0: \mu_1 = \mu_2$

  3. under the null hypothesis, the test statistic $\hat{\theta} = \bar{\mathbf{x}}_2 - \bar{\mathbf{x}}_1$ is distributed as $N\Big(0,\sigma^2\big(\frac{1}{n_1} + \frac{1}{n_2}\big)\Big)$. This statistic is not pivotal, because its distribution depends on the unknown $\sigma$

  4. They then propose that the statistic $t = \frac{\bar{\mathbf{x}}_2 - \bar{\mathbf{x}}_1}{\hat{\sigma}\big(\frac{1}{n_1} + \frac{1}{n_2}\big)^{1/2}}$ is pivotal (my attempt at the algebra behind this claim is sketched below).
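
My attempt at spelling out why the unknown $\sigma$ drops out (this is the standard normal-theory argument as I understand it, assuming equal variances and that $\hat{\sigma}$ is the pooled standard deviation; the book does not show these steps):

$$t = \frac{\bar{\mathbf{x}}_2 - \bar{\mathbf{x}}_1}{\hat{\sigma}\big(\frac{1}{n_1} + \frac{1}{n_2}\big)^{1/2}} = \frac{(\bar{\mathbf{x}}_2 - \bar{\mathbf{x}}_1)\big/\Big(\sigma\big(\frac{1}{n_1} + \frac{1}{n_2}\big)^{1/2}\Big)}{\hat{\sigma}/\sigma} = \frac{Z}{\sqrt{W/(n_1 + n_2 - 2)}},$$

where, under $H_0$, $Z \sim N(0,1)$, $W = (n_1 + n_2 - 2)\,\hat{\sigma}^2/\sigma^2 \sim \chi^2_{n_1+n_2-2}$, and $Z$ and $W$ are independent. Every occurrence of $\sigma$ cancels, so $t$ follows a Student $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom, a distribution that involves no unknown parameter.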

Even with this written out, I didn't really understand the difference between these two quantities, $\sigma$ and $\hat{\sigma}$, and thus I didn't really understand how the distribution of $t$ manages not to depend on the unknown distribution.

Best Answer

The population standard deviation $\sigma$ depends on the (unknown) distribution $F$. The sample standard deviation $\hat{\sigma}$ depends only on the (known) data, $\mathbf{x}$.

Because it is a consistent estimator, $\hat{\sigma}\to\sigma$ as sample size goes to infinity. But with a finite sample, only $\hat{\sigma}$ is knowable.
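
To see the distinction concretely, here is a small simulation sketch (not from the book; the sample sizes, replication count, and the two $\sigma$ values are arbitrary choices). Under $H_0$, the spread of $\hat{\theta} = \bar{\mathbf{x}}_2 - \bar{\mathbf{x}}_1$ changes with $\sigma$, but the spread of $t$ does not:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(sigma, n1=8, n2=12, reps=100_000):
    """Draw theta_hat and t under H0 (mu1 = mu2 = 0) for a given population sigma."""
    x1 = rng.normal(0.0, sigma, size=(reps, n1))
    x2 = rng.normal(0.0, sigma, size=(reps, n2))
    theta_hat = x2.mean(axis=1) - x1.mean(axis=1)
    # Pooled variance estimate sigma_hat^2 computed from both samples.
    s2 = ((n1 - 1) * x1.var(axis=1, ddof=1)
          + (n2 - 1) * x2.var(axis=1, ddof=1)) / (n1 + n2 - 2)
    t = theta_hat / np.sqrt(s2 * (1 / n1 + 1 / n2))
    return theta_hat, t

for sigma in (1.0, 10.0):
    theta_hat, t = simulate(sigma)
    print(f"sigma={sigma:5.1f}  sd(theta_hat)={theta_hat.std():.3f}  sd(t)={t.std():.3f}")

# sd(theta_hat) scales with sigma, so its distribution depends on the unknown F;
# sd(t) stays near sqrt(18/16) ~ 1.06 for both sigmas, reflecting the fixed
# Student t distribution with n1 + n2 - 2 = 18 degrees of freedom.
```

That fixed Student $t$ distribution is what lets you compute $p$-values and confidence intervals without ever knowing $\sigma$.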
