The key assumption is that the signal is band-limited. This is a frequency-domain assumption. Any sensible proof must go through the frequency domain.
Same goes for the proof being an approximation argument. Any real proof must be an approximation argument, including the one you alluded to, or its rigorous version.
Here is another Hilbert space argument that, in the end, gives us approximation in the topology of uniform convergence (much better than $L^1$ or $L^2$):
Let $H$ be the set of band-limited elements of $L^2(\mathbb{R})$ (no extra assumptions). By unitarity of the Fourier transform $\mathcal{F}$, $H$ is a closed subspace of $L^2$, hence a Hilbert space. If $f \in H$, then $\hat{f}$ is an $L^2$ function of compact support and so lies in $L^1$; the Fourier inversion theorem then implies that $f$ agrees almost everywhere with a continuous function. So the band-limited assumption gives us (for now) continuity, and therefore sampling makes sense.
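Writing $[-T, T]$ for the band, $\mathcal{F}(H) = L^2([-T,T])$ has the orthogonal basis $\{ e^{- 2 \pi i k\frac{\xi}{2T}} \}_{k \in \mathbb{Z}}$, which becomes orthonormal after dividing each element by $\sqrt{2T}$. So now it's natural to compute the Hilbert space expansion of $\hat{f}$ in this basis and then apply $\mathcal{F}^{-1}$. By unitarity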
$$
\langle \hat{f}, e^{- 2 \pi i k\frac{\xi}{2T}} \rangle = \langle f, \delta_{\frac{k}{2T}}\rangle = f(\frac{k}{2T}).
$$
Strictly speaking, one needs a rigged Hilbert space that includes distributions to make sense of inner products with delta functions, but everything works out (alternatively, the identity is just the Fourier inversion formula evaluated at $x = \frac{k}{2T}$). On the other hand, the inverse Fourier transforms of the basis elements $\{ e^{- 2 \pi i k\frac{\xi}{2T}} \}_{k \in \mathbb{Z}}$ are, up to a constant factor, just shifts of the $\mbox{sinc}$ function. So Shannon's sampling formula holds in the $L^2$-sense.
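To make the last step explicit, here is a sketch of the expansion. It assumes the band is $[-T,T]$, the convention $f(x) = \int \hat{f}(\xi) e^{2\pi i x \xi}\, d\xi$, and the normalized $\mbox{sinc}(u) = \frac{\sin(\pi u)}{\pi u}$; none of these conventions is fixed explicitly above, so treat them as assumptions. The exponentials have norm $\sqrt{2T}$ in $L^2([-T,T])$, which accounts for the factor $\frac{1}{2T}$:
$$
\hat{f}(\xi) = \frac{1}{2T} \sum_{k \in \mathbb{Z}} f\left(\frac{k}{2T}\right) e^{- 2 \pi i k\frac{\xi}{2T}} \quad \text{on } [-T,T],
\qquad
\int_{-T}^{T} e^{- 2 \pi i k\frac{\xi}{2T}} e^{2 \pi i x \xi}\, d\xi = 2T\, \mbox{sinc}(2Tx - k).
$$
Applying $\mathcal{F}^{-1}$ term by term, the factors $\frac{1}{2T}$ and $2T$ cancel and we get
$$
f(x) = \sum_{k \in \mathbb{Z}} f\left(\frac{k}{2T}\right) \mbox{sinc}(2Tx - k).
$$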
To strengthen the convergence, notice that $L^2$-convergence in the frequency domain implies $L^1$-convergence there, by the Cauchy-Schwarz inequality on the bounded band. Since $\|\mathcal{F}^{-1} g\|_{\infty} \le \|g\|_{L^1}$, back in the time domain we have uniform convergence.
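A quick numerical sanity check of the uniform convergence can be sketched as follows. The test function $\mbox{sinc}(Tx)^2$ (whose spectrum is a triangle supported in $[-T,T]$), the value $T = 1$, the window $[-3,3]$, and the helper names are all choices made here for illustration. The check only looks at a finite grid, so it illustrates the claim rather than proving it.

```python
import numpy as np

T = 1.0  # assumed band limit: spectrum supported in [-T, T]

def f(x):
    # np.sinc is the normalized sinc, sin(pi u) / (pi u).
    # sinc(T x)^2 has a triangular spectrum supported in [-T, T].
    return np.sinc(T * x) ** 2

def partial_sum(x, n_terms):
    # S_N(x) = sum over |k| <= N of f(k / 2T) * sinc(2T x - k)
    k = np.arange(-n_terms, n_terms + 1)
    samples = f(k / (2 * T))
    return np.sinc(2 * T * x[:, None] - k[None, :]) @ samples

x = np.linspace(-3.0, 3.0, 601)  # evaluation grid (finite window only)

def sup_error(n_terms):
    # Largest deviation between f and its truncated sinc series on the grid.
    return np.max(np.abs(f(x) - partial_sum(x, n_terms)))

for n in (50, 400):
    print(n, sup_error(n))
```

The printed error on the grid should shrink as more samples are included, which is what uniform convergence predicts.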
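In fact, a band-limited $L^2$ function agrees almost everywhere with a smooth function. Uniform convergence of the $\mbox{sinc}$ series alone does not give this, since a uniform limit of smooth functions need not be smooth. But because $\hat{f}$ has compact support, $\xi^n \hat{f} \in L^1$ for every $n$, so we may differentiate under the integral sign in the inversion formula as often as we like.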
The answer is in the first post you linked to:
To downsample, it transforms to the frequency domain and deletes the second and third groups of $N/4$ elements (which correspond to the half with the highest frequency components).
So apparently you confused the middle half of the elements as stored in memory with the middle half of the frequencies.
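The storage order is easy to see in code. Below is a minimal NumPy sketch of the downsampling step described in the quote; the function name `fft_downsample_by_2` and the test signal are made up for illustration, and the sketch assumes the length is divisible by 4 and that the signal has no content in the deleted bins.

```python
import numpy as np

def fft_downsample_by_2(x):
    # Sketch only: assumes len(x) is divisible by 4.
    n = len(x)
    spectrum = np.fft.fft(x)
    # FFT storage order: indices 0 .. n/2-1 hold frequencies 0 .. n/2-1,
    # indices n/2 .. n-1 hold the negative frequencies -n/2 .. -1.
    # So the middle of the array (n/4 .. 3n/4-1) is the high-|frequency| half.
    kept = np.concatenate([spectrum[: n // 4], spectrum[3 * n // 4 :]])
    # The inverse transform is half as long, so rescale by 1/2.
    return np.fft.ifft(kept) / 2

# Frequencies in the order the FFT stores them, for n = 8:
print(np.fft.fftfreq(8, d=1 / 8))

# A tone at frequency bin 3 (well below n/4 = 16) survives unchanged.
x = np.cos(2 * np.pi * 3 * np.arange(64) / 64)
y = fft_downsample_by_2(x)
print(np.allclose(y.real, x[::2]))
```

The `fftfreq` output makes the point: the entries in the middle of the array are the ones with the largest absolute frequency, not the ones with "middle" frequencies.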
Best Answer
Well, have a look at the statement of the theorem: it assumes that the signal is band-limited, i.e. it has finite frequency content, so the first integral over $(-\infty,\infty)$ reduces to an integral over the finite interval $[-2\pi W,2\pi W]$, because the signal contains no frequencies larger than $W$. Intuitively, you need this assumption because a signal with infinite frequency content may vary arbitrarily fast, in which case you might need to sample it at an arbitrarily high rate to obtain a perfect reconstruction.
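To see concretely why some bound on the frequency content is needed, here is a small sketch (the sampling rate and the two frequencies are arbitrary values chosen for illustration). A 9 Hz cosine sampled at 10 samples per second produces exactly the same samples as a 1 Hz cosine, so the samples alone cannot tell the two signals apart.

```python
import numpy as np

rate = 10.0          # samples per second (illustrative choice)
n = np.arange(20)    # sample indices
t = n / rate         # sample times

low = np.cos(2 * np.pi * 1.0 * t)    # 1 Hz: below rate / 2
high = np.cos(2 * np.pi * 9.0 * t)   # 9 Hz: above rate / 2

# The two sampled sequences coincide up to rounding error.
print(np.max(np.abs(low - high)))
```

Once the signal is known to contain no frequencies above half the sampling rate, the 9 Hz candidate is ruled out and the ambiguity disappears.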
Finally, substituting $t$ with $\frac{n}{2W}$ is actually a trick to see that the LHS ($f(\frac{n}{2W})$, i.e. $f$ sampled at rate $2W$, twice the largest frequency in the signal) gives you, up to a constant factor, the Fourier series coefficients of the signal's spectrum on the band, which is the RHS (note that by assumption, we only need angular frequencies in the range $[-2\pi W,2\pi W]$, corresponding to frequencies in the range $[-W,W]$). These coefficients determine the spectrum, and since the Fourier transform (if it exists) uniquely determines the signal, we conclude that it must be possible to reconstruct the signal from its samples. Shannon's interpolating formula gives an explicit way to do this.
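Written out, the substitution looks as follows. This is a sketch assuming the inversion convention $f(t) = \frac{1}{2\pi}\int F(\omega) e^{i\omega t}\, d\omega$, where $F$ denotes the spectrum of $f$; both the symbol $F$ and the convention are assumptions here and may differ from the statement you are reading by constant factors.
$$
f\left(\frac{n}{2W}\right) = \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} F(\omega)\, e^{i \omega \frac{n}{2W}}\, d\omega = 2W\, c_{-n},
\qquad
c_m = \frac{1}{4\pi W} \int_{-2\pi W}^{2\pi W} F(\omega)\, e^{-i m \frac{\omega}{2W}}\, d\omega.
$$
Here $c_m$ is the $m$-th Fourier series coefficient of $F$ on the interval $[-2\pi W, 2\pi W]$ of length $4\pi W$, so knowing all the samples $f(\frac{n}{2W})$ is the same as knowing all the coefficients $c_m$.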