Probability Sampling – Determining the Probability of a Unit Being Selected in a Sample

mathematical-statistics probability sampling

I recently studied sampling in an online course and learned about various sampling methods, but I'm confused about the probability of an element being selected in a sample. In that course the lecturer simply stated that the probability of an element being selected in a sample of size $n$ from a population of size $N$ is $n/N$, both with replacement (order matters) and without replacement (order does not matter). I'm not able to derive it, nor can I get the results for the remaining cases. Let the set $A = \{a_1, a_2, a_3, \ldots, a_N\}$ be the population, and suppose you sample $n$ elements from it using one of the methods below:

  1. Simple Random Sampling with Replacement (order matters)
  2. Simple Random Sampling with Replacement (order does not matter)
  3. Simple Random Sampling without Replacement (order matters)
  4. Simple Random Sampling without Replacement (order does not matter)

So what is the probability that an element, say $a_i$, is one of the $n$ elements sampled under each of the above methods?

Edit: (what I mean by order matters)

Take the population set to be A = {1,2,3} and select a sample of size 2. The possible samples are {1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3}, {3,1}, {3,2}, {3,3} (depending on the method used). When order matters, {1,2} and {2,1} are counted as two different samples. When order does not matter, they are counted only once, so the samples become {1,1}, {1,2}, {1,3}, {2,2}, {2,3}, {3,3}.

Edit 2: Further clarification

Suppose the set is A = {1,2,3} and simple random sampling is performed, selecting 2 elements using one of the above methods. I want to know the probability of selection of each element. In short, what is the probability that "1" is selected when the sampling is performed? For case (4) it is easy to derive:

number of samples = 3 ({1,2},{1,3},{2,3})
number of samples in which "1" is present = 2 ({1,2},{1,3})
so the probability of selection = 2/3 = n/N (n = 2, N = 3)

But how can I derive the probability of including a specified unit in a sample in general, both for this case and for the other cases?

Best Answer

Suppose the population is $a_{1}, a_{2}, \ldots, a_{N}$. We choose a sample of size $n$ from the population using simple random sampling and want to know the probability that a specific element, say $a_{1}$, is in the sample.

First, note that the probability of an element being included in the sample is the same whether the sample is ordered or unordered. (One way to see this is that an ordered sample becomes an unordered sample when we ignore the ordering, and ignoring the ordering does not change whether $a_{1}$ is in the sample. So the probability that $a_{1}$ is in the ordered sample must also be the probability that $a_{1}$ is in the unordered sample.) We therefore need only one probability for each replacement scheme, and it covers both the ordered and the unordered case.
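The argument above can be checked by brute force on a small population. The following is only a sketch, assuming the sample is drawn one unit at a time, uniformly and with replacement; the function name `inclusion_ordered_vs_unordered` and the labels 1..N are made up for illustration. It also shows that, after collapsing, the unordered samples need not be equally likely, so counting them naively would give a different number.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def inclusion_ordered_vs_unordered(N, n, target=1):
    """Compare P(target in sample) over ordered and collapsed unordered samples.

    Assumes n uniform draws with replacement from the labels 1..N.
    """
    ordered = list(product(range(1, N + 1), repeat=n))  # N**n equally likely samples
    p_each = Fraction(1, len(ordered))

    # Inclusion probability computed directly on ordered samples.
    p_ordered = sum(p_each for s in ordered if target in s)

    # Forget the order: every ordered sample passes its probability
    # on to the sorted tuple that represents the unordered sample.
    unordered = defaultdict(Fraction)
    for s in ordered:
        unordered[tuple(sorted(s))] += p_each

    p_unordered = sum(p for s, p in unordered.items() if target in s)
    return p_ordered, p_unordered, dict(unordered)


p_o, p_u, dist = inclusion_ordered_vs_unordered(3, 2)
print(p_o, p_u)                   # 5/9 5/9
print(dist[(1, 1)], dist[(1, 2)])  # 1/9 2/9
```

With N = 3 and n = 2, the unordered sample {1,2} carries twice the probability of {1,1}, because it arises from both (1,2) and (2,1).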

Sampling without replacement (unordered)

There are $\binom{N}{n}$ possible ways to choose an unordered sample of size $n$ from a population of size $N$ without replacement. Crucially, we also know that each of those samples is equally likely. How many ways are there to choose a sample containing $a_{1}$? If we must include $a_{1}$, then we have to choose the remaining $n-1$ elements from the remaining $N-1$ elements of the population, which gives $\binom{N-1}{n-1}$ possible samples. Because every sample is equally likely, we can find the probability of choosing $a_{1}$ by taking the ratio

$$\frac{\binom{N-1}{n-1}}{\binom{N}{n}} = \frac{(N-1)!}{(n-1)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{N!} = \frac{n}{N}$$
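As a sanity check on this ratio, one can enumerate every unordered sample for a few small cases and count those containing a given element. This is a minimal sketch, not part of the original derivation; the helper name `inclusion_prob_without_replacement` and the labels 1..N are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def inclusion_prob_without_replacement(N, n, target=1):
    """Fraction of the size-n subsets of {1..N} that contain target."""
    samples = list(combinations(range(1, N + 1), n))  # all equally likely
    hits = sum(1 for s in samples if target in s)
    return Fraction(hits, len(samples))


for N, n in [(3, 2), (5, 2), (6, 4)]:
    brute = inclusion_prob_without_replacement(N, n)
    # Enumeration, the binomial ratio, and n/N should all agree.
    assert brute == Fraction(comb(N - 1, n - 1), comb(N, n)) == Fraction(n, N)
    print(N, n, brute)
```

The case N = 3, n = 2 reproduces the 2/3 worked out by hand in the question.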

Sampling with replacement (ordered)

There are $N^n$ possible ordered samples and all of them are equally likely. Counting the samples that contain $a_{1}$ directly is a bit trickier, but it can be done through the complement: if we count the samples that don't contain $a_{1}$ and subtract that from $N^n$, we get the number of samples that do contain $a_{1}$. A sample avoids $a_{1}$ exactly when each of its $n$ draws comes from the other $N-1$ elements, so there are $(N-1)^n$ such samples, and the number of samples that do contain $a_{1}$ is $N^{n} - (N-1)^n$. This gives the probability of $a_{1}$ being in the sample as

$$\frac{N^{n} - (N-1)^n}{N^{n}} = 1-\left(\frac{N-1}{N}\right)^n$$
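The complement count can be verified the same way, by listing all $N^n$ ordered samples for small $N$ and $n$. Again this is only an illustrative sketch; the names `inclusion_prob_with_replacement` and `closed_form` are hypothetical, and the population is assumed to be labelled 1..N.

```python
from fractions import Fraction
from itertools import product


def inclusion_prob_with_replacement(N, n, target=1):
    """Fraction of the N**n ordered samples that contain target at least once."""
    samples = product(range(1, N + 1), repeat=n)
    hits = sum(1 for s in samples if target in s)
    return Fraction(hits, N ** n)


def closed_form(N, n):
    """1 - ((N-1)/N)**n, the formula obtained from the complement argument."""
    return 1 - Fraction(N - 1, N) ** n


for N, n in [(3, 2), (4, 3), (5, 5)]:
    assert inclusion_prob_with_replacement(N, n) == closed_form(N, n)

print(closed_form(3, 2))  # 5/9
```

For N = 3 and n = 2 this gives 5/9, which is not equal to n/N = 2/3, so the two replacement schemes have different inclusion probabilities.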

By symmetry these probabilities are the same for every $a_{i}$.
