Estimation – How to Estimate a Population Total from a Stratified Simple Random Sample

estimation, population, sample, stratification

I have a stratified population. I want to estimate a population total $T$ from a stratified simple random sample. I have two strategies:

  1. I compute $\displaystyle T=\sum_h N_h\bar x_h$ where $\bar x_h$ is the sample mean of stratum $h$ and $N_h$ is the number of population elements in stratum $h$.

  2. I compute $T$ as the sum over all population elements, where each sampled element is given its actual value and each remaining non-sampled element is given the sample mean of the stratum to which it belongs.

I want to know which strategy is better to estimate the population total $T$. Thanks!

Best Answer

Those two sums are mathematically equivalent

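To analyse your proposal, I will call your two totals $T_1$ and $T_2$ respectively. Suppose we index the population values as $x_{h,i}$, where $h=1,\dots,H$ is the stratum and $i = 1,\dots,N_h$ indexes the elements within stratum $h$. Without loss of generality, order the elements so that the first $n_h$ values in each stratum are the sampled ones, so that the stratum sample mean is $\bar{x}_h = \frac{1}{n_h} \sum_{i=1}^{n_h} x_{h,i}$. We will denote the imputed value for the second total as: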

$$\tilde{x}_{h,i} = \begin{cases} x_{h,i} & & \text{for } 1 \leqslant i \leqslant n_h, \\[6pt] \bar{x}_h & & \text{for } n_h < i \leqslant N_h. \\[6pt] \end{cases}$$

Then your proposed totals are:

$$\begin{align} T_1 &\equiv \sum_{h=1}^H N_h \bar{x}_h, \\[6pt] T_2 &\equiv \sum_{h=1}^H \sum_{i=1}^{N_h} \tilde{x}_{h,i}. \\[6pt] \end{align}$$

We now demonstrate that these are equivalent:

$$\begin{align} T_2 &= \sum_{h=1}^H \sum_{i=1}^{N_h} \tilde{x}_{h,i} \\[6pt] &= \sum_{h=1}^H \Bigg[ \sum_{i=1}^{n_h} x_{h,i} + \sum_{i=n_h+1}^{N_h} \bar{x}_h \Bigg] \\[6pt] &= \sum_{h=1}^H \Bigg[ n_h \bar{x}_h + (N_h-n_h) \bar{x}_h \Bigg] \\[6pt] &= \sum_{h=1}^H N_h \bar{x}_h \\[6pt] &= T_1. \\[6pt] \end{align}$$
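If you want to convince yourself numerically, the identity can be checked with a short simulation. The following is only a minimal sketch: the function names `total_expansion` and `total_imputed`, the stratum sizes, and the simulated normal populations are all made up for illustration and are not part of the question.

```python
import random


def total_expansion(samples, stratum_sizes):
    """T_1: sum over strata of N_h times the stratum sample mean."""
    return sum(
        N_h * (sum(xs) / len(xs))
        for xs, N_h in zip(samples, stratum_sizes)
    )


def total_imputed(samples, stratum_sizes):
    """T_2: sampled units keep their value; every non-sampled unit
    is assigned the sample mean of its stratum."""
    total = 0.0
    for xs, N_h in zip(samples, stratum_sizes):
        mean_h = sum(xs) / len(xs)
        # n_h observed values followed by (N_h - n_h) imputed values
        filled = list(xs) + [mean_h] * (N_h - len(xs))
        total += sum(filled)
    return total


# Hypothetical setup: three strata with assumed sizes N_h and sample sizes n_h.
random.seed(0)
stratum_sizes = [50, 120, 30]
sample_sizes = [5, 12, 3]

# Simulated populations (illustrative only), one list of values per stratum.
populations = [
    [random.gauss(10 * (h + 1), 2) for _ in range(N_h)]
    for h, N_h in enumerate(stratum_sizes)
]

# Simple random sample without replacement within each stratum.
samples = [random.sample(pop, n_h) for pop, n_h in zip(populations, sample_sizes)]

t1 = total_expansion(samples, stratum_sizes)
t2 = total_imputed(samples, stratum_sizes)
print(t1, t2)  # the two values agree up to floating-point rounding
```

Any difference between the two printed values is floating-point rounding only. Since $T_1 = T_2$ for every possible sample, the two strategies have the same bias and variance, so neither is better than the other.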