I have met the same problem reading that. I did some research and found this paper helpful: https://doi.org/10.1063/1.2889939
The authors of this paper argue that for microcanonical ensemble, it is the Boltzmann-Planck definition $S=k\ln\omega$ that should be used, where $\omega$ is the number of microstates with total energy being held strictly fixed at $E$. The equally division of kinetic energy is still valid:
$$\langle\mathbf{p}_i\cdot\nabla_{\mathbf{p}_j}H\rangle = \delta_{ij}2\frac{\langle K\rangle}{N} = \delta_{ij}\frac{2}{N}\langle\sum_{i=1}^N\frac{p_i^2}{2m_i}\rangle$$
But because of the use of $\omega$ instead of $\Omega$, now $\langle K \rangle$ has no direct relation to the temperature. The temperature is related to the kinetic energy by:
$$\left(\frac{3N}{2}-1\right)kT=\left[\langle K^{-1}\rangle\right]^{-1}$$
This implies that if we want to calculate the temperature of a microcanonical ensemble, we should calculate the ensemble average of the inverse of the kinetic energy.
The result above can be derived as follows. We assume that the Hamiltonian can be expressed as
$$H = \sum_{i=1}^N \frac{\mathbf{p}_i^2}{2m_i} + U(\mathbf{r}_1\ldots,\mathbf{r}_N)$$
The number of microstates in a microcanonical ensemble is
$$\omega(N,V,E) = \frac{\Delta E}{g} \int\delta(E-H) d^{3N}rd^{3N}p$$
So the temperature can be calculated as
$$\frac{1}{kT} = \frac{\partial\ln\omega}{\partial E} = \frac{1}{\omega} \frac{\Delta E}{g} \int\delta^\prime(E-H) d^{3N}r d^{3N}p$$
To evaluate that, we can perform the Laplace transform on it ($E$ to $s$):
\begin{aligned}
\mathcal{L}\left[\frac{\partial\ln\omega}{\partial E}\right] &= \frac{1}{\omega} \frac{\Delta E}{g} \int\int\delta^\prime(E-H)e^{-sE}dE\ d^{3N}r d^{3N}p
\\&= \frac{1}{\omega} \frac{\Delta E}{g} \int se^{-sH}\ d^{3N}r d^{3N}p
\\&= \frac{1}{\omega} \frac{\Delta E}{g} \int s\exp[ -s (\sum_{i=1}^N \frac{\mathbf{p}_i^2}{2m_i} + U)] d^{3N}rd^{3N}p
\\&= \frac{1}{\omega} \frac{\Delta E}{g}(2\pi)^{3N/2} \prod_{i=1}^Nm_i^{3/2} s^{-(3N/2-1)} \int e^{-sU}d^{3N}r
\end{aligned}
By inverse Laplace transform we can get
$$ \frac{1}{kT} = \frac{1}{\omega} \frac{\Delta E}{g} (2\pi)^{3N/2} \prod_{i=1}^Nm_i^{3/2} \int \frac{(E-U)^{3N/2-2}}{\Gamma(3N/2-1)} \Theta(E-U) d^{3N}r$$
where $\Theta$ is the Heaviside step function. On the other hand, the ensemble average of the inverse of kinetic energy is
$$\langle K^{-1}\rangle = \langle (E-U)^{-1}\rangle = \frac{1}{\omega} \frac{\Delta E}{g} \int \frac{\delta(E-H)}{E-U} d^{3N}pd^{3N}r$$
The Laplace transform of this is
\begin{aligned}
\mathcal{L}\left[\langle K^{-1}\rangle\right] &= \frac{1}{\omega} \frac{\Delta E}{g} \int \frac{e^{-sH}}{H-U}d^{3N}pd^{3N}r
\\&= \frac{1}{\omega} \frac{\Delta E}{g} \int \frac{ \exp(-s\sum_{i=1}^N \frac{p_i^2}{2m_i}) }{ \sum_{i=1}^N \frac{p_i^2}{2m_i} } d^{3N}p \int \exp(-sU) d^{3N}r
\end{aligned}
To calculate the integral over $p$, we can make a coordinate transform $p^\prime_i = p_i/\sqrt{2m_i}$, and let $\xi^2 = \sum {p^\prime}^2$:
\begin{aligned}
\mathcal{L}\left[\langle K^{-1}\rangle\right] &= \frac{1}{\omega} \frac{\Delta E}{g} 2^{3N/2} \prod_{i=1}^Nm_i^{3/2} \int \frac{e^{-s\xi^2}}{\xi^2} d^{3N}p^\prime \int e^{-sU} d^{3N}r
\\&=\frac{1}{\omega} \frac{\Delta E}{g} 2^{3N/2} \prod_{i=1}^Nm_i^{3/2} \int_0^\infty \frac{e^{-s\xi^2}}{\xi^2} A(\xi)d\xi \int e^{-sU} d^{3N}r
\end{aligned}
where $A(\xi)$ is the surface of a 3N-dimension sphere with radius $\xi$ and is equal to
$$ A(\xi) = \frac{2\pi^{3N/2}}{\Gamma(3N/2)}\xi^{3N-1}$$
Substitution of this into the integration above yields
$$\mathcal{L}\left[\langle K^{-1}\rangle\right] = \frac{1}{\omega} \frac{\Delta E}{g} (2\pi)^{3N/2} \prod_{i=1}^N m_i^{3/2} \frac{1}{3N/2-1} \int \frac{e^{sU}}{s^{3N/2-1}} d^{3N}r$$
By inverse Laplace transform, we can finally get
$$\langle K^{-1} \rangle = \frac{1}{\omega} \frac{\Delta E}{g} (2\pi)^{3N/2} \prod_{i=1}^N m_i^{3/2} \frac{1}{3N/2-1} \int \frac{(E-U)^{3N/2-2}}{\Gamma(3N/2-1)} \Theta(E-U) d^{3N}r$$
By comparing this with the result of $1/kT$, we can easily find
$$\left(\frac{3N}{2}-1\right)kT=\left[\langle K^{-1}\rangle\right]^{-1}$$
So using different definitions of entropy yields different results. The authors elaborated on why they believe $S=k\ln\omega$ is the right one to use. They calculated the distribution of speeds using both $\omega$ and $\Omega$ and demonstrated that only the former is consistent with their MD simulation of 10 hard-sphere particles. Since the entropy can also be seen as an ensemble average ($S=-k\langle \ln\rho\rangle$, $\rho$ is the probability density of a microstate), it is reasonable to do it in the phase space $\omega$ too. Also, they found that $\langle K\rangle_\Omega=\sum\langle p_i^2\rangle_\Omega / 2m_i \not= \langle E-U\rangle_\Omega$ and this result is considered as not physical. Those who approve the use of $S=k\ln\Omega$ often cite the result that $\Omega$ is adiabatically invariant, but this reason is insufficient in the authors' opinion.
The authors also derived the result for $NVE\mathbf{PG}$ ensemble in which total momentum $\mathbf{P}$ and $\mathbf{G}=t\mathbf{P} - M_{total}r_c$ are also constant. This is important for molecular dynamics simulation because most of the time we do the "$NVE$ ensemble MD", we only simulate once and then calculate the time average. If there is no external force, what we get is actually a $NVE\mathbf{PG}$ ensemble. The results are slightly different:
$$\langle\frac{\mathbf{p}_i^2}{2m_i}\rangle = \frac{M_{totle}-m_i}{M_{total}(N-1)}\langle E-U\rangle + \frac{Nm_i-M_{total}}{2(N-1)M_{total}^2}\mathbf{P}^2$$
$$\left(\frac{3(N-1)}{2}-1\right)kT=\left[\left\langle \left(E-U-\frac{\mathbf{P}^2}{2M_{total}}\right) ^ {-1} \right\rangle\right]^{-1}$$
In the thermodynamic limit where $N$ is large, the result of the equipartition theorem and the results derived in this paper for $NVE$ and $NVE\mathbf{PG}$ ensemble become equivalent. So in that case it's still safe to use the equipartition theorem to calculate the temperature in $NVE$ MD simulations.
Best Answer
J. A. S. Lima and A. R. Plastino published the article On the classical energy equipartition theorem back in 1999. In their article they derive a generalized equipartition theorem. Their generalized approach is valid for systems with arbitrary distribution functions and for systems with non-quadratic terms in the Hamiltionian. A link to their article is https://doi.org/10.1590/S0103-97332000000100019. This article should answer most of your questions.
Their article can be summarized as follows:
Suppose there is a system with $f$ degrees of freedom and the Hamiltonian is \begin{equation} \mathcal{H} = g(x_1,...,x_L) + h. \end{equation} $(x_1,...,x_L)$ is a subset of the phase-space coordinates for positions and momentas. $g$ is homogeneous such that
\begin{equation} g\left(\lambda x_{1}, \ldots, \lambda x_{L}\right)=\lambda^{r} g\left(x_{1}, \ldots, x_{L}\right). \end{equation} For systems that are distributed according to a Boltzmann distribution you can derive the expression
\begin{equation} \langle g\rangle=\frac{L}{r} k_B T. \end{equation} Here $k_B$ is the Boltzmann constant and $T$ is the temperature. This is the generalized form of the equipartition theorem that you are looking for. For example, for a monovalent ideal gas you have $r=2$ and $L=3N$ such that you recover the famous result $U = \frac{3}{2}N k_B T$ where U is the internal energy of the gas.