If $X$ represents here the sample mean $\bar X_n$, then the Central Limit Theorem says that the quantity
$$Z = \sqrt n(\bar X_n-\mu)$$ tends in distribution to $N(0,\sigma^2)$ as $n$ tends to infinity, and then by abusing notation and asymptotics, we write
$$ \bar X_n = \frac{1}{\sqrt n}Z + \mu$$ which gives us that $\bar X_n \approx N(\mu,( \frac{\sigma}{\sqrt n})^2) $.
...which in a sense holds only for some "intermediate range" of $n$: if $n$ truly passes over to infinity, the distribution collapses to a single point, since the variance goes to zero (as it should).
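A quick simulation can make this concrete. The sketch below (the uniform population, sample size, and trial count are arbitrary choices for illustration) draws many sample means and checks that their spread matches $\sigma^2/n$:

```python
import random
import statistics

# Draw sample means of size n from a uniform(0, 1) population
# (mu = 0.5, sigma^2 = 1/12) and compare their empirical variance
# with the CLT prediction sigma^2 / n.
random.seed(0)

n = 100          # sample size
trials = 5000    # number of sample means to collect

means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(trials)]

emp_var = statistics.pvariance(means)
theory_var = (1 / 12) / n   # sigma^2 / n for uniform(0, 1)

print(emp_var, theory_var)
```

Making `n` larger shrinks both numbers toward zero together, which is the "collapse to a point" in the limit.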
The notion of confidence interval is somewhat intuitive, but that may be keeping you from understanding what it means in more depth.
Say I have multiple samples $x_i$ from a population, and I wish to estimate the population mean $\mu$. A CI of, say, 95% represents an interval of possible values of $\mu$ such that, given my samples, the "probability" that $\mu$ lies in that interval is 95%.
We immediately see that there can be more than one such interval, since I could trade probability past the upper end for probability at the lower end of the interval, thus shifting the interval. Let's skirt that issue by demanding a symmetric interval about my sample mean.
But the "probability" is not well defined from the information I just presented!
In order to assign a probability, I have to make some assumptions about the population. The usual assumption is that the population variance is equal to the unbiased estimator of variance obtained from our sample. But we still have things backward: We can't honestly talk about the probability of the population mean being in some range, without any assumption about the a priori (before I saw my samples) probabilities of the mean being various values.
So we apply the usual sleight-of-mind logic employed by the frequentist point of view. We ask:
Given that the population variance is our unbiased sample variance estimate, what are the highest and lowest values of the population mean $\mu$ such that the chance of our sample being as far away from $\mu$ as it is, is no lower than 100% - 95% = 5%?
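As a sketch of that recipe (the data here are invented, and 1.96 is the two-sided 5% point of the standard normal), the interval can be computed like this:

```python
import math
import random
import statistics

# Frequentist recipe: treat the unbiased sample variance as the
# population variance, then take mu within 1.96 standard errors of
# the sample mean.
random.seed(1)
sample = [random.gauss(10.0, 2.0) for _ in range(50)]  # hypothetical data

xbar = statistics.fmean(sample)
s = statistics.stdev(sample)              # unbiased sample estimate
half_width = 1.96 * s / math.sqrt(len(sample))

lo, hi = xbar - half_width, xbar + half_width
print(f"95% CI for mu: ({lo:.2f}, {hi:.2f})")
```

The interval is symmetric about the sample mean, exactly the convention chosen above to avoid the shifting-interval ambiguity.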
Now let's go back to your problem. Since the population is finite, as you draw more samples (without replacement) you actually do learn something about the population. Suppose you had drawn all the objects but one, and you take your unbiased sample variance as the population variance. Your 95% confidence interval for the value of that one remaining object would be roughly $2\sigma$ wide, but your estimate of the population mean would have a standard deviation of only about $\sigma/N$. This is quite a bit smaller than the $\sigma/\sqrt N$ you would get for an infinite population or a small sampling of a large population.
Now when you draw that last sample, you know everything about the distribution. In particular, you know the mean exactly. Therefore any interval that includes the actual mean is a 100% CI. If you then say that the real CI is the tightest such interval, then it has width zero.
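A small sketch of the finite-population point (the population of $N = 200$ normal draws is hypothetical): with one object left undrawn, the mean estimate is already very close, and once every object has been drawn, the sample mean *is* the population mean:

```python
import random
import statistics

# A finite population of N objects, sampled without replacement.
random.seed(2)
N = 200
population = [random.gauss(0.0, 1.0) for _ in range(N)]
true_mean = statistics.fmean(population)

shuffled = random.sample(population, N)   # draw all N without replacement

# After drawing N - 1 objects, only one value is unknown, so the
# mean estimate can be off by at most (x_last - mean) / (N - 1).
partial_mean = statistics.fmean(shuffled[:-1])

# After drawing all N, the estimate equals the true mean exactly.
full_mean = statistics.fmean(shuffled)
print(abs(partial_mean - true_mean), abs(full_mean - true_mean))
```

The residual error after the last draw is zero, which is the width-zero "100% CI" described above.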
Best Answer
If $X$ is a normal random variable, you can record an observation of it, $x$, and compare it to the mean. The usual way to do this is to standardize the variable, i.e.,
$$z = \frac{x - \mu}{\sigma}$$
Let's say that $X_1, X_2, \ldots, X_n$ are random variables from the same distribution as $X$ above. If we record observations of each and calculate the mean, that's also a random variable. However, we can't expect our new random variable, the mean $\overline{X}$, to have the same distribution as our original distribution. It will have the same mean, but it won't have the same variance.
Think of it this way: Make $n$ larger and larger — record more and more observations. It seems that after a while, the mean of all those observations will be the same as the mean from the population. To make it a bit more concrete: Flip a coin a few times, letting $X = 1$ for heads and $X = 0$ for tails. Will your mean be $0.5$? Probably not. Flip it a few more times. Maybe you're a bit closer to $0.5$. By the time you flip the coin, say, a few thousand times, you'll probably be very close to $0.5$.
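The coin-flip experiment is easy to simulate; this sketch (the flip counts are chosen arbitrarily) prints the running mean for a few values of $n$:

```python
import random
import statistics

# Flip a fair coin n times (heads = 1, tails = 0) and record the
# mean; larger n pulls the mean toward 0.5.
random.seed(3)

def flip():
    return random.randint(0, 1)

means_by_n = {n: statistics.fmean(flip() for _ in range(n))
              for n in (10, 100, 10_000)}

for n, mean_n in means_by_n.items():
    print(n, mean_n)
```

With ten flips the mean can easily land at 0.3 or 0.7; by a few thousand flips it rarely strays more than a couple of percent from 0.5.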
In other words, when we record a sample mean, making many observations restricts, in the long run, how far we can stray from the true mean. This is reflected in the fact that
$$\text{Var}(\overline{X}) = \frac{\sigma^2}{n}$$
Note that as $n \rightarrow \infty$, $\text{Var}(\overline{X}) \rightarrow 0$: the sample mean homes in on the true mean (this shrinking variance is really the Law of Large Numbers at work).
The Central Limit Theorem tells us that, regardless of the distribution of the original random variable, as we take larger and larger samples, the distribution of the sample mean (i.e., of $\overline{X}$) approaches a normal distribution, and so we can use all the convenient properties of the normal distribution (like the standardized form). So when we write
$$z = \frac{\overline{x}-\mu}{\sigma / \sqrt{n}}$$
it's really the same thing as the earlier $z$: It's the difference of an observation and an expected value divided by the standard deviation of whatever distribution the observation came from. The new standard deviation (the standard error) is derived from the old one, but that's because the new distribution is derived from the old one.
Remember, $\sigma^2$ stands for population variance. So regardless of whether we're looking at a sample of size $n$ or just one observation, it's always the population variance. $\sigma^2/n$ is the variance of the sample mean in terms of the population variance. (And, of course, $\sigma/\sqrt{n}$ is the square root of the variance of the sample mean.)
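To tie the two $z$'s together, here is a small numeric sketch (the values $\mu = 100$, $\sigma = 15$, $n = 36$ are invented for illustration):

```python
import math

# Both z's are "observation minus mean, over the relevant standard
# deviation" -- what changes is which distribution the observation
# came from.
mu, sigma = 100.0, 15.0

# A single observation: standardize with sigma itself.
x = 110.0
z_single = (x - mu) / sigma

# A mean of n = 36 observations: standardize with the standard
# error sigma / sqrt(n).
n = 36
xbar = 103.0
z_mean = (xbar - mu) / (sigma / math.sqrt(n))

print(z_single, z_mean)   # 0.666..., 1.2
```

Notice that $\overline{x} = 103$ is closer to $\mu$ than $x = 110$ is, yet its $z$-score is larger, because the distribution of the mean is much tighter.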