I like to think about it as described in this article by Shankar. In counterterm renormalization, you are essentially studying Wilsonian renormalization, but with the intention of sending the cutoff to infinity at the end (as is necessary to have a continuum definition of a field theory). The UV "divergences" that show up in perturbation theory are the result of taking this limit sloppily. Counterterms are a mathematical construct that allow you to take this limit more carefully. The fact that the counterterms can be different depending on renormalization scheme, but the physics must be independent of the choice of scheme, is what leads to the idea of scale dependence and running couplings in the counterterm picture (this is why we are free to use MS, MS bar, or on shell, and not worry about conflicts in predictions). From this point of view, in counterterm renormalization, we integrate out the modes above some cutoff, which (as we know from the Wilsonian picture) gives finite results for the couplings as functions of the cutoff. We then use this result to define the continuum limit of the field theory, by pushing the cutoff to infinity, equivalent to taking it as much larger than any scales of interest. This leaves us with counterterms capturing the limiting behavior of the cutoff dependence necessary for the limit to exist, and an arbitrary mass scale capturing what the Wilsonian picture told us about the scaling behavior of the theory. For the other part of your question, yes, Wilsonian renormalization can be done loopwise. By symmetry, the Feynman diagrams which show up in perturbative field theoretic calculations will be identical to the Feynman diagrams generating corrections to the couplings from the fast modes, with the only difference being that the integrals are now convergent, taken over the momentum shell from the scaled down cutoff to the original cutoff.
I would like to add one additional note: the leftover information from Wilsonian renormalization of a perturbatively renormalizable field theory after the continuum limit has been taken can also be viewed as the information telling us about the anomaly of scale invariance. From this point of view, Wilsonian renormalization is a more careful way of handling how scale invariance behaves at the quantum level, by studying the proper way to apply a scale transformation to the path integral. This is analogous to how Fujikawa computed the chiral anomaly; he looked at the fact that the path integral measure broke the symmetry. In this case, it's not as "obvious" as the measure breaking the symmetry, it's the loop corrections which appear in the Wilsonian effective action which break the symmetry. The beta functions tell us exactly how it is broken (or preserved, in very special cases!).
Hope this helped.
Edit: It has been some time since I wrote the original answer, and I am not completely satisfied with it (some issues regarding these concepts were brought up again to me recently). The description here is, of course, very rough. The Wilsonian point of view gives us a philosophy to understand effective field theories and why renormalization works, but in involved calculations the old way is preferred. Ideas from the "Exact Renormalization Group" are the closest thing in existence to making the connection between the points of view precise, but even in those cases the computations in the Wilson scheme are simply intractable. At the end of the day, it comes down to an issue of philosophy and pragmatism. The best choice is to use counterterm renormalization to compute, and the Wilson view to interpret.
Best Answer
I think it's quite easy to prove with functional integrals. In Wilson renormalization one splits the field in low energy modes and high energy ones, say $\phi = \varphi + \Phi$, where $\varphi$ only has support on small momenta and $\Phi$ on large ones. The action will be $S[\varphi,\Phi] = S_1[\varphi] + S_2[\Phi] + S_\text{int}[\varphi,\Phi]$.
Imagine now that the full theory (high energy) has a symmetry under $\varphi \to \tilde\varphi[\varphi]$ and $\Phi \to \tilde\Phi[\Phi]$. This means that the action is invariant: $S[\tilde\varphi,\tilde\Phi]=S[\varphi,\Phi]$. Note also that I've used the fact that internal symmetries do not mix low energy modes with high energy ones. EDIT: To understand this one can work, for example, in momentum space. The low and high energy modes are defined such that
$$ \varphi(k) = \begin{cases} \phi(k) \quad \text{ for } k \leq \Lambda \\ 0 \quad \quad \,\text{ otherwise}\end{cases}\,, \qquad \Phi(k) = \begin{cases} 0 \quad\quad\,\, \text{ for } k \leq \Lambda \\ \phi(k) \quad \,\text{ otherwise}\end{cases}\,. $$ Since internal symmetries do not act on the momenta, but only on the indices of the fields, the transformed of, for example, $\varphi(k)$ will still have only support on $k\leq \Lambda$, i.e. it will still be a low energy mode. END OF EDIT
Now, the effective action obtained integrating over high energy modes, is defined as
$$ e^{iS_\text{eff}[\varphi]} = e^{iS_1[\varphi]}\int \mathcal{D}\Phi \, \exp \left( iS_2[\Phi]+iS_\text{int}[\varphi,\Phi] \right)\,. $$
Now, perform a symmetry transformation on $\varphi$,
\begin{align} e^{i S_\text{eff}[\tilde\varphi]} &= e^{iS_1[\tilde \varphi]} \int \mathcal{D}\Phi \exp\left( iS_2[\Phi] + i S_\text{int}[\tilde\varphi,\Phi]\right) \\ &= e^{iS_1[\tilde \varphi]} \int \mathcal{D}\tilde\Phi \exp\left( iS_2[\tilde\Phi] + i S_\text{int}[\tilde\varphi,\tilde\Phi]\right) \\ &= e^{iS_1[\varphi]} \int \mathcal{D}\Phi \exp\left( iS_2[\Phi] + i S_\text{int}[\varphi,\Phi]\right)= e^{iS_\text{eff}[\varphi]}\,, \end{align}
where in the second equality I've just performed a change of integration variable $\Phi\to\tilde\Phi$, and in the third I've used the fact that the action is invariant and the symmetry is not anomalous (i.e. $\mathcal{D}\Phi = \mathcal{D}\tilde\Phi$).
Therefore, if the original theory is invariant, the effective low energy one is invariant as well, i.e. no symmetry breaking terms can be generated in the renormalization procedure.
I hope this answers your question!