Suppose $X_1,\dots,X_n$ and $Y_1,\dots,Y_n$ are all independent with mean zero and variance 1.
Replace only the last term in the sum: for a three times differentiable function $f$
with bounded third derivative, two Taylor expansions centered at $(X_1+\dots+X_{n-1})/\sqrt n$ give
$$
\left|E\left[f\left(\frac{X_1+\dots+X_{n-1} + Y_n}{\sqrt n}\right)\right] - E\left[f\left(\frac{X_1+\dots+X_{n-1} + X_n}{\sqrt n}\right)\right]\right|
\le \frac{\sup_{t\in \mathbb R} |f'''(t)|}{n^{3/2}} E\left[|X_n|^3 + |Y_n|^3\right]
$$
because the zeroth-, first- and second-order terms of the two Taylor expansions cancel:
$X_n$ and $Y_n$ are independent of $X_1,\dots,X_{n-1}$ and have the same mean and variance.
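To spell out the cancellation, here is a sketch of the expansion. It writes $S=(X_1+\dots+X_{n-1})/\sqrt n$ (a shorthand introduced here for convenience) and assumes that $f'''$ is bounded and that the expectations involved are finite. Taylor's theorem gives

$$
f\left(S+\frac{X_n}{\sqrt n}\right) = f(S) + f'(S)\frac{X_n}{\sqrt n} + f''(S)\frac{X_n^2}{2n} + R_X,
\qquad |R_X| \le \frac{\sup_{t\in\mathbb R} |f'''(t)|}{6\,n^{3/2}}\,|X_n|^3,
$$

and the same expansion holds with $Y_n$ and a remainder $R_Y$ in place of $X_n$ and $R_X$. Since $X_n$ is independent of $S$, we have $E[f'(S)X_n]=E[f'(S)]\,E[X_n]=0$ and $E[f''(S)X_n^2]=E[f''(S)]\,E[X_n^2]=E[f''(S)]$, and likewise for $Y_n$. Subtracting the two expansions leaves only the remainders:

$$
\left|E\left[f\left(S+\frac{Y_n}{\sqrt n}\right)\right] - E\left[f\left(S+\frac{X_n}{\sqrt n}\right)\right]\right|
= \left|E[R_Y] - E[R_X]\right|
\le \frac{\sup_{t\in\mathbb R} |f'''(t)|}{6\,n^{3/2}}\,E\left[|X_n|^3+|Y_n|^3\right].
$$

This is slightly stronger than the bound stated above, which simply drops the factor $1/6$.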
Now suppose $X_1,\dots,X_n$ are iid with a common distribution with mean zero and variance 1,
and $Z_1,\dots,Z_n$ are iid $N(0,1)$, independent of the $X_i$'s. Apply the replacement above
iteratively: transform $X_1,\dots,X_n$ to $X_1,\dots,X_{n-1},Z_n$, then to
$X_1,\dots,X_{n-2},Z_{n-1},Z_n$, then to $X_1,\dots,X_{n-3},Z_{n-2},Z_{n-1},Z_n$, and so on until all the $X_i$'s have been replaced with $Z_i$'s. The same bound holds at each step, whichever summand is replaced, so we accumulate $n$ times the error term above:
$$
\left|E\left[f\left(\frac{X_1+\dots+X_{n-1} + X_n}{\sqrt n}\right)\right] - E[f(Z)]\right|
\le \frac{\sup_{t\in \mathbb R} |f'''(t)|}{n^{1/2}} E\left[|X_1|^3 + |Z_1|^3\right]
$$
where $Z=(Z_1+\dots+Z_n)/\sqrt n$ has $N(0,1)$ distribution. At this point you have proved that $\left|E\left[f\left(\frac{X_1+\dots+X_n}{\sqrt n}\right)\right] - E[f(Z)]\right|$ converges to 0 as $n\to+\infty$ whenever $E|X_1|^3$ is finite, for any function $f$ such that $\sup_{t\in \mathbb R}|f'''(t)|$ is finite.
This proof does not give you the full generality of the CLT, but it is quite approachable early in anyone's probability journey, requiring no measure theory background.
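As a sanity check on the final bound, here is a minimal sketch for one case where both sides can be computed exactly. The choices are assumptions made for illustration, not part of the argument above: $f=\cos$ (so $\sup_t|f'''(t)|=1$) and $X_i=\pm1$ with probability $1/2$ each (so $E|X_1|^3=1$). In that case $E[\cos((X_1+\dots+X_n)/\sqrt n)]=\cos(1/\sqrt n)^n$, $E[\cos(Z)]=e^{-1/2}$ and $E|Z_1|^3=2\sqrt{2/\pi}$. The function names are hypothetical.

```python
import math


def rademacher_gap(n):
    """|E[cos(S_n / sqrt(n))] - E[cos(Z)]| for X_i = +/-1 with probability 1/2."""
    # The characteristic function of a +/-1 coin flip is cos(t), so
    # E[cos(S_n / sqrt(n))] = cos(1 / sqrt(n)) ** n.
    lhs = math.cos(1.0 / math.sqrt(n)) ** n
    # E[cos(Z)] for Z ~ N(0, 1) is exp(-1/2).
    rhs = math.exp(-0.5)
    return abs(lhs - rhs)


def replacement_bound(n):
    """Right-hand side of the bound with sup|f'''| = 1 and E|X_1|^3 = 1."""
    third_abs_moment_normal = 2.0 * math.sqrt(2.0 / math.pi)  # E|Z_1|^3
    return (1.0 + third_abs_moment_normal) / math.sqrt(n)


for n in (1, 10, 100, 1000, 10000):
    print(n, rademacher_gap(n), replacement_bound(n))
```

In this symmetric example the actual gap shrinks faster than $n^{-1/2}$, which is consistent with the result: the inequality is only an upper bound.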
Reference: A User's Guide to Measure Theoretic Probability by David Pollard.
Yes, the CLT only works for the sample mean.
The second diagram you have given is not clear, but the sampling distribution they refer to is the distribution of the sample mean.
If you compute the standard deviation of each sample and repeat this over many samples, the resulting values will not follow a normal distribution. In particular, there is a bias: dividing the sum of squared deviations by $n$ underestimates the population variance on average, which is why we divide by $n-1$ instead of $n$ when we use the sample data to estimate the spread of the population.
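The bias can be seen in a small simulation. This is only a sketch under assumed settings (samples of size 5 from $N(0,1)$, 20000 repetitions, a fixed seed), and the function name is hypothetical. It averages the two variance estimates, dividing by $n$ and by $n-1$, over many samples.

```python
import random


def mean_variance_estimates(n=5, reps=20000, seed=0):
    """Average the divide-by-n and divide-by-(n-1) variance estimates
    over many samples of size n drawn from N(0, 1)."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        sum_sq = sum((x - mean) ** 2 for x in sample)
        total_div_n += sum_sq / n
        total_div_n_minus_1 += sum_sq / (n - 1)
    return total_div_n / reps, total_div_n_minus_1 / reps


div_n, div_n_minus_1 = mean_variance_estimates()
print("divide by n:    ", div_n)
print("divide by n - 1:", div_n_minus_1)
```

With a true variance of 1, the divide-by-$n$ average should land near $(n-1)/n = 0.8$, while the divide-by-$(n-1)$ average should land near 1.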
Best Answer
I am only addressing the example you give, which is straightforward: if $X_n$, $n\in\mathbb N^*$, are i.i.d. with distribution $\mathcal N(\mu,\sigma^2)$, then for all $n\in\mathbb N^*$, $$ \sqrt n\frac{\frac1n(X_1+\cdots+X_n)-\mu}{\sigma}\sim\mathcal N(0,1), $$ so its distribution not only converges to $\mathcal N(0,1)$ but is equal to it for every $n$.
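A quick simulation illustrates this for a small $n$. It is a sketch with assumed values ($n=4$, $\mu=2$, $\sigma=3$, 20000 repetitions, fixed seed) and a hypothetical function name: the standardized sample mean should already have mean about 0, variance about 1, and roughly 95% of its values within $\pm1.96$.

```python
import math
import random


def standardized_means(n=4, mu=2.0, sigma=3.0, reps=20000, seed=1):
    """Draw reps samples of size n from N(mu, sigma^2) and return
    sqrt(n) * (sample mean - mu) / sigma for each sample."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        values.append(math.sqrt(n) * (sample_mean - mu) / sigma)
    return values


values = standardized_means()
mean = sum(values) / len(values)
variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
coverage = sum(abs(v) <= 1.96 for v in values) / len(values)
print(mean, variance, coverage)
```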