Which survey is more accurate? Assume the samples are taken perfectly randomly.
- A sample of 100 people out of a population of 1000 (the sample is 10% of the population)
- A sample of 1000 people out of a population of 1000000 (the sample is 0.1% of the population)
I remember my lecturer saying something like "when the sample size is small compared to the population, the accuracy depends almost entirely on the sample size; the population size is unimportant". Is there a name for that result? It's quite surprising at first.
I'd love to see some graphs of these functions.
If it helps, here's a concrete example (made up by me).
An unknown proportion $p$ of the population favour candidate Alice. The rest favour Bob. We take a random sample of size $k$ from the population (of size $n$) and ask their preferences, to come up with an estimate $\hat{p}$.
How does the expected error $\mathbb{E}|\hat{p} - p|$ depend on $k$ and $n$? And in the limit $n\to\infty$?
Best Answer
Note: For convenience, in the following I use $N$ for the size of the population and $n$ for the sample size (OP's $n$ and $k$, respectively).
In order to answer OP's questions, we start with some preliminary work and describe the situation in somewhat more detail.
Before we can answer OP's questions we have to do some general
Now, since $d$ and the expression (2) provide us with a precise idea of accuracy, we are ready to harvest.
Observe that the variance of the estimator $\hat{\theta}$ decreases with increasing sample size $n$, so that the inequality above will be satisfied if we choose $n$ large enough that
\begin{align*} z\sqrt{\operatorname{var}(\hat{\theta})}\leq d\tag{3} \end{align*}
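As a rough numerical illustration of the left-hand side of (3), the sketch below evaluates $z\sqrt{\operatorname{var}(\hat{p})}$ for the two surveys in the question. It assumes simple random sampling without replacement, for which the variance of the sample proportion is $\frac{N-n}{N-1}\cdot\frac{p(1-p)}{n}$, and it uses the worst case $p=0.5$ and $z=1.96$ (roughly 95% confidence). The function name `margin_of_error` and the default values are choices made for this example only.

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Return z * sqrt(var(p_hat)) for a sample of size n from a population of size N.

    Assumes simple random sampling without replacement; p=0.5 and z=1.96
    are illustrative defaults, not values taken from the answer.
    """
    fpc = (N - n) / (N - 1)            # factor through which N enters the variance
    variance = fpc * p * (1 - p) / n   # variance of the sample proportion
    return z * math.sqrt(variance)

small = margin_of_error(100, 1000)         # 10% of the population sampled
large = margin_of_error(1000, 1_000_000)   # 0.1% of the population sampled

print(f"n=100,  N=1000:    {small:.4f}")
print(f"n=1000, N=1000000: {large:.4f}")
```

Under these assumptions the second survey has the smaller margin (about 0.031 versus about 0.093), even though it covers a much smaller fraction of its population.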
These are the relevant parameters to deal with accuracy. Next we consider
Note: When $N$ is large compared with the sample size $n$, formula (3) reduces to
\begin{align*} n\simeq \lim\limits_{N\rightarrow \infty}\frac{1}{\frac{N-1}{N}\cdot\frac{1}{n_0}+\frac{1}{N}}=n_0 \end{align*}
Since then $n=n_0$, we obtain
\begin{align*} n=\frac{z^2}{d^2}p(1-p)\tag{4} \end{align*}
and we see, in accordance with OP's lecturer, that when the sample size $n$ is small compared with the population size $N$, the accuracy $d$ depends on the sample size only.
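To see how quickly the limit is approached, here is a minimal sketch that evaluates the expression inside the limit for finite $N$ and compares it with $n_0$ from (4). The function name `required_sample_size`, the target accuracy $d=0.03$, and the defaults $p=0.5$, $z=1.96$ are assumptions made for this illustration.

```python
def required_sample_size(d, N=None, p=0.5, z=1.96):
    """Sample size needed for accuracy d.

    With N=None this is n_0 from formula (4), the large-population limit.
    With a finite N it is the expression inside the limit:
    1 / ((N-1)/N * 1/n_0 + 1/N).
    p=0.5 and z=1.96 are illustrative defaults.
    """
    n0 = z**2 * p * (1 - p) / d**2
    if N is None:
        return n0
    return 1.0 / ((N - 1) / N / n0 + 1.0 / N)

n0 = required_sample_size(0.03)
print(f"limit n_0: {n0:.1f}")
for N in (1_000, 10_000, 1_000_000, 1_000_000_000):
    print(f"N={N:>13,}: n = {required_sample_size(0.03, N):.1f}")
```

With these inputs the required sample size is roughly 516 for $N=1000$ but already within about one unit of $n_0\approx 1067$ for $N=10^6$, which is the sense in which the population size stops mattering.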
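With regard to OP's question about a name: I'm not aware of a specific term for this result. The factor through which the population size $N$ enters the variance is, however, commonly called the finite population correction, and the result amounts to saying that this correction is negligible when $n$ is small compared with $N$.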
Note: This answer is mostly based upon Sampling, chapter 5: Estimating Proportions, Ratios and Subpopulation Means by Steven K. Thompson.