The exact distribution of the random number $X$ of unsatisfactory oranges drawn from the sample is binomial, under the assumption that there are many, many more oranges than the sample size $n = 125$. The lot will be rejected if $X > 10$. Assuming that the true proportion of unsatisfactory oranges is $p = 0.09$, the resulting probability of rejection is $$\Pr[X > 10] = 1 - \sum_{x=0}^{10} \binom{125}{x} (0.09)^x (0.91)^{125-x} \approx 0.576514.$$ Using the normal approximation, $$X \dot\sim \operatorname{Normal}(\mu = np, \sigma^2 = np(1-p)),$$ and we would calculate, using continuity correction, $$\Pr[X > 10] \approx \Pr\left[\frac{X - \mu}{\sigma} > \frac{10 - 125(0.09) + 0.5}{\sqrt{125(0.09)(0.91)}} \right] = \Pr[Z > -0.234404].$$ Evaluating the standard normal CDF, this is approximately $0.592664$. The quality of the approximation is poor because (a) $p$ is small, and (b) the $z$-score boundary is close to $0$.
But here's what's interesting: if you don't use continuity correction, the $z$-score is $-0.390673$, and the probability is... (drumroll)...
$$0.65198.$$ That is very close to answer choice (C), acceptably so (my calculation was done on a computer rather than with a normal table, so it is more precise). Clearly, the writer of the question intended the solution to rest on two assumptions that are not, in my opinion, entirely legitimate:
- The solution should use the normal approximation to the binomial.
- The normal approximation should be applied without continuity correction.
The issue with the first is minor. It's not totally unreasonable to use a normal approximation, even if a reasonably proficient student can use a calculator to compute a binomial sum of relatively few terms. But the second issue, as we can very clearly see from the exact calculation, is quite major. Failure to use continuity correction magnifies the error of the approximation to an unacceptable extent (about $0.075$ against the exact value, versus about $0.016$ with the correction), because the real distribution here is binomial, not normal, and in a finite, integer-valued sample of $n = 125$ oranges, $8\%$ is exactly 10 oranges.
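For anyone who wants to check these figures, here is a minimal sketch using only the Python standard library. The variable names (`cutoff`, `with_cc`, `without_cc`) are my own, and the script should land close to the three probabilities quoted above rather than being taken as a definitive computation.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, cutoff = 125, 0.09, 10  # the lot is rejected when X > cutoff

# Exact binomial tail: Pr[X > 10] = 1 - Pr[X <= 10]
exact = 1 - sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(cutoff + 1))

mu = n * p                     # mean of the approximating normal
sigma = sqrt(n * p * (1 - p))  # its standard deviation
Z = NormalDist()               # standard normal distribution

# Normal approximation with and without the 0.5 continuity correction
with_cc = 1 - Z.cdf((cutoff + 0.5 - mu) / sigma)
without_cc = 1 - Z.cdf((cutoff - mu) / sigma)

print(f"exact binomial:             {exact:.6f}")
print(f"normal, with correction:    {with_cc:.6f}")
print(f"normal, without correction: {without_cc:.6f}")
```

The only difference between the two approximations is the `+ 0.5` in the numerator, which is what moves the $z$-score from $-0.390673$ to $-0.234404$.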
I should also add that, in a testing situation with multiple choice options, it is often useful to reason in other ways so as to eliminate incorrect answer choices. Right away, I can tell you that the probability of rejection should be greater than $0.5$. This is because if the true proportion of unsatisfactory oranges is greater than the criterion of $8\%$, the test will be more likely to reject than to not reject. It may not be much greater than $0.5$, but intuitively, it shouldn't be less, and certainly not significantly less.
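One way to back up that intuition, as a sketch: it relies on the standard fact that any median $m$ of a $\operatorname{Binomial}(n, p)$ distribution satisfies $\lfloor np \rfloor \le m \le \lceil np \rceil$. Here $np = 125(0.09) = 11.25$, so $m \in \{11, 12\}$, and since $X$ is integer-valued, $$\Pr[X > 10] = \Pr[X \ge 11] \ge \Pr[X \ge m] \ge \frac{1}{2}.$$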
Best Answer
If you look at Equation 2.3 of that paper, I believe they are describing uniform subsampling. You just draw samples from a data set with uniform probability. This will non-parametrically approximate the underlying distribution, and is often called bootstrapping. In this case (Eq. 2.3), they're using bootstrapping to approximate the expected value.
See the wiki entry: https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
In Python, you can get a uniform sample easily using libraries like NumPy:
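A minimal sketch along those lines, which is not the paper's code: the `data` array is a made-up stand-in for the real data set, and `bootstrap_mean` is a hypothetical helper I named for illustration. It draws one uniform sample with `rng.choice`, then repeats the resampling many times to approximate the expected value.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the run is repeatable

# Hypothetical data set (an assumption; substitute your own observations)
data = rng.normal(loc=5.0, scale=2.0, size=1_000)

# One uniform sample: every element of `data` is equally likely to be drawn
sample = rng.choice(data, size=100, replace=True)


def bootstrap_mean(values, n_resamples=1_000, rng=rng):
    """Approximate the expected value by uniform resampling with replacement."""
    n = len(values)
    # Each row holds n indices drawn uniformly from 0..n-1 (one resample per row)
    idx = rng.integers(0, n, size=(n_resamples, n))
    means = values[idx].mean(axis=1)
    # Average of the resample means, and their spread as a standard error
    return means.mean(), means.std(ddof=1)


estimate, std_error = bootstrap_mean(data)
print(f"bootstrap estimate of the mean: {estimate:.3f} (std. error {std_error:.3f})")
```

Sampling with `replace=True` is what makes this a bootstrap; if plain subsampling without repeats is what you need, pass `replace=False` to `rng.choice` instead.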