I think you might be confusing the expected sampling distribution of a mean (which we would calculate based on a single sample) with the (usually hypothetical) process of simulating what would happen if we did repeatedly sample from the same population multiple times.
For any given sample size (even n = 2) we would say that the sample mean (from the two people) estimates the population mean. But the estimation accuracy -- that is, how good a job we've done of estimating the population mean based on our sample data, as reflected in the standard error of the mean -- will be poorer than if we had a 20 or 200 people in our sample. This is relatively intuitive (larger samples give better estimation accuracy).
We would then use the standard error to calculate a confidence interval, which (in this case) is based around the Normal distribution (we'd probably use the t-distribution in small samples since the standard deviation of the population is often underestimated in a small sample, leading to overly optimistic standard errors.)
In answer to your last question, no we don't always need a Normally distributed population to apply these estimation methods -- the central limit theorem indicates that the sampling distribution of a mean (estimated, again, from a single sample) will tend to follow a normal distribution even when the underlying population has a non-Normal distribution. This is usually appropriate for "bigger" sample sizes.
Having said that, when you have a non-Normal population that you're sampling from, the mean might not be an appropriate summary statistic, even if the sampling distribution for that mean could be considered reliable.
Suppose $X$ is Poisson with parameter $\lambda$, and $Y$ is normal with mean and variance $\lambda$. It seems to me that the appropriate comparison is between $\Pr(X = n)$ and $\Pr(Y \in [n-\frac12,n+\frac12])$. Here for simplicity I write $n = \lambda + \alpha \sqrt\lambda$, that is, we are interested when $n$ corresponds to $\alpha$ standard deviations from the mean.
So I cheated. I used Mathematica. So both $\Pr(X = n)$ and $\Pr(Y \in [n-\frac12,n+\frac12])$ are asymptotic to
$$ \frac 1{\sqrt{2\pi \lambda}} e^{-\alpha^2/2} $$
as $\lambda \to \infty$. But their difference is asymptotic to
$$ \frac{\alpha \left(\alpha ^2-3\right) e^{-\alpha ^2/{2}}}{6 \sqrt{2
\pi } \lambda } $$
If you plot this as a function of $\alpha$, you will get the same curve as is shown in the second to last figure in http://www.johndcook.com/blog/normal_approx_to_poisson/.
Here are the commands I used:
n = lambda + alpha Sqrt[lambda];
p1 = Exp[-lambda] lambda^n/n!;
p2 = Integrate[1/Sqrt[2 Pi]/Sqrt[lambda] Exp[-(x-lambda)^2/2/lambda], {x, n-1/2, n+1/2}];
Series[p1, {lambda, Infinity, 1}]
Series[p2, {lambda, Infinity, 1}]
Also, with a bit of experimentation, it seems to me that a better asymptotic approximation to $\Pr(X = n)$ is $\Pr(Y \in [n-\alpha^2/6,n+1-\alpha^2/6])$. Then the error is
$$ -\frac{\left(5 \alpha ^4-9 \alpha ^2-6\right) e^{-{\alpha ^2}/{2}}
}{72 \sqrt{2 \pi } \lambda ^{3/2} } $$
which is about $\sqrt\lambda$ times smaller.
Best Answer
Update: As @whuber has pointed out with his comments, a better way to look at this is by computing the true coverage probabilities for the Poisson. The simulation, while also with its uses, does not reveal the interesting pattern seen in the plot below.
This was based on @whuber's code (see his first comment on this answer):
What this does: If $X \sim \textrm{Pois}(\lambda)$ then $E(X) = \lambda$ and $V(X) = \lambda$. This means that the interval in question is $I := (\lambda - \sqrt \lambda, \lambda + \sqrt \lambda)$. The function
f
computes $$ \mathbb P_\lambda(X \in I) = F_X(\lambda + \sqrt \lambda; \lambda) - F_X(\lambda - \sqrt \lambda; \lambda) $$ where in R $F_X(t; \lambda)$ is obtained via theppois
function.Original answer: This is in no way a categorical answer but I thought you might like to see a simulation. Note that I'm using samples of size $n = 20000$ because you didn't mention that you cared about the sample size, so I wanted each sample to reflect asymptotic properties.
The simulation shows that the Poisson random variables (RVs) do not behave indistinguishably from the Normal RVs until around $\lambda \approx 1000$ with respect to the coverage rate and this choice of $n$. We can also see the variation in the coverage of random samples of Normal RVs even though they all exactly have the property that we are investigating at the population level. Note that in this simulation I compared a random sample to its sample mean and sample standard deviation rather than the population mean and population standard deviation. I chose to do so because I felt this to be more interesting for a discussion about the distribution of a statistic calculated from a sample.
Here's the code to make the plot.