First, Thomas-Fermi screening is a semiclassical, static theory. It assumes that the total potential $\phi(\mathbf{r})$ varies slowly on the scale of the Fermi length $l_{\text{F}}$, that the chemical potential $\mu$ is constant, and that $T$ is low. In principle, it does not rely on linear response theory.
The condition of a slowly varying potential is a general condition of validity for semiclassical models. Physically, if the particle [electron] is represented by a wave packet, what it is telling us is that all the waves in the wave packet see the same potential, so the particle suffers [or enjoys!] a force as if it were point-like ["classical"]: such potentials give rise to ordinary forces in the equations of motion describing the evolution of the position and wavevector of the packet. The wave packet must have a well-defined wavevector on the scale of the Brillouin zone [thus $\Delta k \ll k_{\text{F}}$] and therefore must be spread in real space over many primitive cells.
Mathematically, the assumption that the potential is a slowly varying function of position implies that the theory is not valid for $|\mathbf{q}| \gg k_{\text{F}}$ [and therefore for $|\mathbf{r}| \ll l_{\text{F}}$].
On the other hand, the static Lindhard dielectric function is a fully quantum treatment of the problem and is valid over the whole range of $\mathbf{q}$. In the limit $\mathbf{q} \rightarrow 0$ it reduces to the linearized Thomas-Fermi dielectric function. It only assumes linear response, that is, that the induced charge density is proportional to the total potential $\phi(\mathbf{r})$.
Note also that the Lindhard treatment is far more general than the Thomas-Fermi one, in the sense that it can describe both dynamic and static screening.
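To make the $\mathbf{q} \rightarrow 0$ statement concrete, here is a minimal sketch comparing the two static dielectric functions. It assumes a 3D free electron gas at $T = 0$, for which the static Lindhard result can be written as $\epsilon(q) = 1 + (k_0^2/q^2)\,F(q/2k_{\text{F}})$ with $F(x) = \tfrac12 + \tfrac{1-x^2}{4x}\ln\left|\tfrac{1+x}{1-x}\right|$. The function names and the sample values of `k_f` and `k_tf` are illustrative choices, not taken from the discussion above.

```python
import math

def lindhard_f(x):
    """Static Lindhard function F(x), with x = q / (2 k_F).

    Assumes a 3D free electron gas at zero temperature.
    """
    if x == 0.0:
        return 1.0   # q -> 0 limit: Thomas-Fermi is recovered
    if x == 1.0:
        return 0.5   # q = 2 k_F: the logarithmic term vanishes in the limit
    return 0.5 + (1.0 - x * x) / (4.0 * x) * math.log(abs((1.0 + x) / (1.0 - x)))

def epsilon_thomas_fermi(q, k_tf):
    """Linearized Thomas-Fermi dielectric function, 1 + k_TF^2 / q^2."""
    return 1.0 + (k_tf / q) ** 2

def epsilon_lindhard(q, k_tf, k_f):
    """Static Lindhard dielectric function for the same electron gas."""
    return 1.0 + (k_tf / q) ** 2 * lindhard_f(q / (2.0 * k_f))

# Illustrative values in arbitrary units of inverse length (assumptions).
k_f, k_tf = 1.0, 1.2
for q in (0.01, 0.5, 2.0, 10.0):
    print(q, epsilon_thomas_fermi(q, k_tf), epsilon_lindhard(q, k_tf, k_f))
```

For $q \ll k_{\text{F}}$ the two columns agree, while for $q \gg k_{\text{F}}$ the Lindhard screening term is suppressed by the extra factor $F \simeq 4k_{\text{F}}^2/3q^2$, which is where the Thomas-Fermi form overestimates the screening.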
The deep insight of Anderson is that the difference between insulators and conductors is not the energy spectrum. In fact the entire picture we are taught in introductory courses is highly misleading. [Note: Everything I am going to talk about will be about single particle effects, so no interaction.]
First let's just remember the introductory picture. We have a perfect crystal, so we get energy bands. We fill those bands up with electrons. In the case when a band is partially filled we get a conductor. In the case when all of our bands are completely occupied, so that the Fermi level lies in the gap, we get an insulator.
Now the problem: finite conductivity is entirely dependent on impurities. In the absence of impurities momentum is completely conserved. If I give the carriers any momentum, they will never lose it. Therefore a finite current can never dissipate, which is the same as saying the resistance is zero. Since there will always be some carriers at any non-zero temperature, in the absence of impurities all materials will be "perfect conductors".
So it is clear that to make any sense we need to add impurities. However, if we add impurities, the nice energy band picture disappears. Since we just added random stuff to our Hamiltonian, there is no reason we shouldn't be able to find a state of any energy if we look hard enough. Obviously there will be more states in what used to be the bands, but there will also be states in the gap. In short, the bands will blur together.
But if the bands blur together then there is no longer any notion of a gap - so what could possibly separate insulators and conductors? It is not the electronic energy spectrum, it is the electronic wavefunctions themselves. Since there is no longer translational symmetry, these are not restricted to the Bloch form. There are two main possibilities:
1) The wavefunctions near the Fermi level are extended, i.e. their magnitude is roughly constant over the entire system, like a plane wave. This is a conductor.
2) The wavefunctions near the Fermi level are localized, i.e. their magnitude decays roughly exponentially as you go out from some point. This is an insulator.
This is what actually distinguishes insulators and conductors. Going back to the band gap classification of materials - why does it basically work? The reason is that if one adds disorder to a perfect crystal, the states that are added in the gap and near the band edges are usually localized states, so thinking about the gaps leads to the correct answer. But this is not the direct physical mechanism.
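As a rough numerical illustration of the extended/localized distinction, here is a sketch that is my own choice of toy model rather than anything specified above: a 1D nearest-neighbour tight-binding chain with hopping set to 1 and on-site energies drawn uniformly from $[-W/2, W/2]$. It uses the standard transfer-matrix recursion $\psi_{n+1} = (E - \varepsilon_n)\psi_n - \psi_{n-1}$ and estimates the growth rate $\gamma$ of the solution, whose inverse is the localization length. The function name, energy, disorder width and chain length are all hypothetical parameters.

```python
import math
import random

def lyapunov_exponent(energy, disorder_width, n_sites=20000, seed=0):
    """Estimate the inverse localization length of a 1D disordered chain.

    Toy model (an assumption for illustration): hopping = 1, on-site
    energies uniform in [-W/2, W/2], transfer-matrix recursion
    psi[n+1] = (E - eps[n]) * psi[n] - psi[n-1].
    """
    rng = random.Random(seed)
    psi_prev, psi = 0.0, 1.0
    log_growth = 0.0
    for _ in range(n_sites):
        eps = rng.uniform(-disorder_width / 2.0, disorder_width / 2.0)
        psi_prev, psi = psi, (energy - eps) * psi - psi_prev
        # Renormalize at every step to avoid overflow, keeping track of the log.
        norm = math.hypot(psi_prev, psi)
        psi_prev /= norm
        psi /= norm
        log_growth += math.log(norm)
    return log_growth / n_sites

# Same energy inside the clean band (|E| < 2), with and without disorder.
gamma_clean = lyapunov_exponent(energy=0.5, disorder_width=0.0)
gamma_disordered = lyapunov_exponent(energy=0.5, disorder_width=3.0)
print("clean chain:     ", gamma_clean)
print("disordered chain:", gamma_disordered)
```

In the clean chain the growth rate is zero up to finite-size noise (a Bloch-like, extended state), while with disorder it is finite, meaning the amplitude decays exponentially away from some point. One caveat about this toy model: in 1D with uncorrelated disorder all states end up localized, so it illustrates case 2) but cannot show a transition between the two cases.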
I feel a bit ashamed, as I found the solution to this apparent contradiction. In my previous intuition I took the charge density to be constant, which is wrong...
Assuming that we apply an electric field to a metal along the direction $x$, one has $\phi(x) = \phi_0e^{-k_0 x}$ inside the metal. Using the Thomas-Fermi relation $k_0^2 = 4\pi e^2\,\partial n/\partial \mu$ (Gaussian units), one can compute the total induced charge over an interface of area $S$:

$$ \begin{aligned} Q^{\rm ind} &= S\int_0^{\infty}\rho^{\rm ind}(x)\,{\rm d}x\\ &= S\int_0^{\infty}-e^2\frac{\partial n}{\partial \mu}\phi_0e^{-k_0 x}\, {\rm d}x \\ &= -\frac{S \phi_0}{4 \pi} \int_0^{\infty} k_0^2e^{-k_0 x}\,{\rm d}x \\ &= -\frac{k_0 \phi_0 S}{4 \pi} \\ &= -C_{\rm Q} \phi_0 \end{aligned} $$

Here $C_{\rm Q} = k_0 S / 4\pi$, or $C_{\rm Q} = \epsilon_0 k_0 S$ in SI units, is the quantum capacitance of the metallic interface. Therefore a better screening (higher $k_0$) also implies a larger total number of electrons $Q^{\rm ind}/(-e)$ affected by the potential, although the length over which they are spread is smaller.
Please correct me if I'm wrong!
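A quick numerical check of this integral, as a sketch in SI units: it integrates the screened profile directly and compares the magnitude of the induced charge with $\epsilon_0 k_0 S \phi_0$. It assumes the SI form of the induced density, $\rho^{\rm ind} = -\epsilon_0 k_0^2 \phi$, which follows from $\rho^{\rm ind} = -e^2 (\partial n/\partial\mu)\phi$. The values of `k0`, `area` and `phi0` are order-of-magnitude placeholders, not data for a particular metal.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m

def induced_charge(k0, area, phi0, n_steps=20000, cutoff=30.0):
    """Integrate rho_ind(x) = -eps0 * k0^2 * phi0 * exp(-k0 x) over x >= 0.

    Midpoint rule on [0, cutoff / k0]; the tail beyond the cutoff is
    exponentially small and is neglected.
    """
    x_max = cutoff / k0
    dx = x_max / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx
        total += -EPSILON_0 * k0 ** 2 * phi0 * math.exp(-k0 * x) * dx
    return area * total

def quantum_capacitance(k0, area):
    """Closed form C_Q = eps0 * k0 * S from the analytic integral."""
    return EPSILON_0 * k0 * area

# Placeholder values (assumptions): screening length ~ 1 angstrom,
# interface area 1 cm^2, surface potential 1 V.
k0, area, phi0 = 1.0e10, 1.0e-4, 1.0
q_numeric = induced_charge(k0, area, phi0)
q_closed_form = quantum_capacitance(k0, area) * phi0
print(abs(q_numeric), q_closed_form)
```

Doubling `k0` doubles the magnitude of the induced charge while halving the depth over which it sits, which is the point made above.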