This is a perfectly correct derivation that uses the correspondence principle nicely: we can identify the group velocity with the classical velocity because a classical particle corresponds to a quantum particle whose wavefunction is a sharply peaked wavepacket, whose velocity is the group velocity.
If you want to do it more formally, you can also start from the usual definitions of the group velocity and phase velocity,
$$v_g = \frac{d \omega}{dk}, \quad v_p = \frac{\omega}{k}.$$
The simplest form of the de Broglie relations are
$$E = \hbar \omega, \quad p = \hbar k.$$
Your forms are perfectly right too, but a little more complicated because there are $2\pi$'s all over the place. Next, we know that for a free particle, the energy is
$$E = \frac12 mv^2 = \frac{p^2}{2m}.$$
Combining this with the de Broglie relations, we have
$$\omega = \frac{\hbar k^2}{2m}.$$
Using the definitions of the group and phase velocity
$$v_g = \frac{\hbar k}{m}, \quad v_p = \frac{\hbar k}{2m}.$$
Then $v_g = 2 v_p$ as desired. Incidentally, another way to that the classical velocity is $v = v_g$ is to note
$$p = \hbar k = m v$$
and comparing with our previous expression gives $v = v_g$.
I think you make it sound much more mysterious than it is. The relativistic distribution function is
$$
f_p = \frac{1}{(2\pi)^3}\exp(-(\mu-u\cdot p)/T)\,
$$
where $u_\alpha$ is the 4-velocity of the fluid, $p_\alpha$ is the 4-momentum of the particle, $T$ is temperature, and $\mu$ is the chemical potential. This is sometimes called the Juttner distribution, and it has obvious generalizations to Bose-Einstein and Fermi-Dirac statistics.
This expression is obviously Lorentz-invariant ($u\cdot p$ is a scalar, and so are $\mu$ and $T$), it reduces to Boltzmann in the rest frame of the fluid, and it is Gallilean invariant for slowly moving fluids.
The funny Bessel functions appear if you try to determine the fugacity $e^{\mu/T}$ in terms of the density, $n=N_0$ with
$$
N_\mu = \int \frac{d^3p}{p^0} p_\mu f_p\, ,
$$
because now your normalization factor contains integrals over $\exp(-\sqrt{p^2+m^2}/T)$
Additional Remarks: After some prodding by the OP I looked at the Treumann et al. paper (it is available on the arxive, http://arxiv.org/abs/1105.2120). I initially thought that this was just a needlessly complicated rederivation of well-known results, but this is not the case. The paper is just wrong. (Frankly, it is not a good sign if a paper that points out a major flaw in relativistic kinetic theory is published in the physics section of the arxive, and a geophysics journal.)
The paper starts out o.k., observing that since $d^3xd^3p$ is Lorentz invariant, $f_p$ must be a scalar. However, they write down a distribution function which is clearly not a scalar unless I assume that $T$ is the zeroth component of a vector. Even if that could be arranged, the result is nor right because it is not Galilean invariant for small velocities. He then writes down a non-covariant expression for $N_\mu$. If $f_p$ is a scalar, $d^3p p_\mu f_p$ is not a vector, and as a result his particle density $N_0$ does not transform as a density (I wrote the correct expression above). Since $N_0$ is wrong, the normalization $\Lambda$ is also wrong.
More Remarks: After more prodding, an attempt to answer the original questions:
1) The original Boltzmann distribution is anisotropic if the fluid velocity $\vec{u}$ is not zero (the distribution is isotropic in the fluid rest frame), but presumably you are looking for something more general. The distribution function of a viscous fluid is anisotropic already in the rest frame of the fluid, $f_p=f_p^0 +\delta f_p$ with $$\delta f_p \sim \eta\sigma_{\alpha\beta}p^\alpha p^\beta,$$ where $\sigma_{\alpha\beta}$ is the relativistic strain tensor. Even more generally, we can parameterize the distribution function in terms of the currents and the stresses. This is the relativistic version of Grad's 13 moment method, see, for example, http://arxiv.org/abs/1301.2912. People have also written down models for highly anisotropic distribution functions, see for example http://arxiv.org/abs/1007.0130 .
2) The temperature is a scalar. It is related to the kinetic energy per particle in the rest frame. The kinetic energy density is the 00 component of a tensor, and transforms accordingly.
Best Answer
When you say "velocity" and "phase velocity", I assume you mean "group velocity" and "phase velocity" of matter waves, where $$v_g = \frac{d\omega}{dk} = \frac{dE}{dp}, \quad v_p = \frac{\omega}{k} = \frac{E}{p}$$ where the second equalities are from the de Broglie relations. For a relativistic particle, $$E^2 = p^2 + m^2$$ which implies that $$E \, dE = p \, dp$$ or equivalently, $$v_p v_g = 1$$ which is the first equation you cite.
This equation is correct, but as you notice, the nonrelativistic limit is more subtle. The reason is that in nonrelativistic physics, we don't work with the full energy $E$ above, but instead subtract out the rest mass energy, giving $T = E - m$. This subtraction changes the phase velocity, $$v_g' = \frac{dT}{dp} = v_g, \quad v_p' = \frac{T}{p} \neq v_p.$$ This is why most modern textbooks don't talk much about the phase velocity of matter waves; it changes under things like shifts in energy, so it's not a very useful idea.
In any case, we have $$(T+m)^2 = p^2 + m^2$$ which implies that $$(T+m) \, dT = p \, dp$$ from which we conclude $$(v_p' + m/p) \, v_g = (v_p' + 1/\gamma v_g) v_g = 1$$ which simplifies to give $$v_p' v_g = 1 - \frac{1}{\gamma}$$ in agreement with your second equation. So both equations are right. They're just using different definitions of the phase velocity, coming from different definitions of the energy.