What does it mean to say that a microcanonical ensemble is one that maximizes entropy? With respect to what variable?
There are different entropies involved: thermodynamic(function of $E,V,N$) and information-theoretical (function of probabilities, also sometimes called the Gibbs entropy). These are not the same thing.
The general rule is: the thermodynamic entropy for some macroscopic state X is the maximum possible value of information entropy functional for all probability distributions that obey given constraints implied by state X.
The concept of microcanonical ensemble can be understood as "the best" probability description of a system for these constraints: known and fixed energy, volume and number of particles: $E,V,N$.
Information entropy is a function of set of probabilities for all possible states. If the space of states is discrete and has $N$ distinct states, the information entropy functional can be defined as
$$
I[{p_k}] = - \sum_{k=1}^N p_k \ln p_k.
$$
(this function was studied by Claude Shannon and Edwin Jaynes who gave some arguments why this function and not some other).
For a system with known and fixed $E,V,N$, all the states in the sum above must be possible, in other words, all states $k$ have the same energy, volume, number of particles. Other states are not included in the sum.
Different probability assignments for those states, while all compatible with constraints implied by $E,V,N$ , may give different values for $I$. The distribution $p_k^* = 1/N$ gives, for large $N$, the maximum possible value for $I$. This maximum possible value then gives thermodynamic entropy of the system in macroscopic state $E,V,N$, according to formula
$$
S(E,V,N) = k_B I [ \{ p_k^*\} ] = k_B \ln N.
$$
I have met the same problem reading that. I did some research and found this paper helpful: https://doi.org/10.1063/1.2889939
The authors of this paper argue that for microcanonical ensemble, it is the Boltzmann-Planck definition $S=k\ln\omega$ that should be used, where $\omega$ is the number of microstates with total energy being held strictly fixed at $E$. The equally division of kinetic energy is still valid:
$$\langle\mathbf{p}_i\cdot\nabla_{\mathbf{p}_j}H\rangle = \delta_{ij}2\frac{\langle K\rangle}{N} = \delta_{ij}\frac{2}{N}\langle\sum_{i=1}^N\frac{p_i^2}{2m_i}\rangle$$
But because of the use of $\omega$ instead of $\Omega$, now $\langle K \rangle$ has no direct relation to the temperature. The temperature is related to the kinetic energy by:
$$\left(\frac{3N}{2}-1\right)kT=\left[\langle K^{-1}\rangle\right]^{-1}$$
This implies that if we want to calculate the temperature of a microcanonical ensemble, we should calculate the ensemble average of the inverse of the kinetic energy.
The result above can be derived as follows. We assume that the Hamiltonian can be expressed as
$$H = \sum_{i=1}^N \frac{\mathbf{p}_i^2}{2m_i} + U(\mathbf{r}_1\ldots,\mathbf{r}_N)$$
The number of microstates in a microcanonical ensemble is
$$\omega(N,V,E) = \frac{\Delta E}{g} \int\delta(E-H) d^{3N}rd^{3N}p$$
So the temperature can be calculated as
$$\frac{1}{kT} = \frac{\partial\ln\omega}{\partial E} = \frac{1}{\omega} \frac{\Delta E}{g} \int\delta^\prime(E-H) d^{3N}r d^{3N}p$$
To evaluate that, we can perform the Laplace transform on it ($E$ to $s$):
\begin{aligned}
\mathcal{L}\left[\frac{\partial\ln\omega}{\partial E}\right] &= \frac{1}{\omega} \frac{\Delta E}{g} \int\int\delta^\prime(E-H)e^{-sE}dE\ d^{3N}r d^{3N}p
\\&= \frac{1}{\omega} \frac{\Delta E}{g} \int se^{-sH}\ d^{3N}r d^{3N}p
\\&= \frac{1}{\omega} \frac{\Delta E}{g} \int s\exp[ -s (\sum_{i=1}^N \frac{\mathbf{p}_i^2}{2m_i} + U)] d^{3N}rd^{3N}p
\\&= \frac{1}{\omega} \frac{\Delta E}{g}(2\pi)^{3N/2} \prod_{i=1}^Nm_i^{3/2} s^{-(3N/2-1)} \int e^{-sU}d^{3N}r
\end{aligned}
By inverse Laplace transform we can get
$$ \frac{1}{kT} = \frac{1}{\omega} \frac{\Delta E}{g} (2\pi)^{3N/2} \prod_{i=1}^Nm_i^{3/2} \int \frac{(E-U)^{3N/2-2}}{\Gamma(3N/2-1)} \Theta(E-U) d^{3N}r$$
where $\Theta$ is the Heaviside step function. On the other hand, the ensemble average of the inverse of kinetic energy is
$$\langle K^{-1}\rangle = \langle (E-U)^{-1}\rangle = \frac{1}{\omega} \frac{\Delta E}{g} \int \frac{\delta(E-H)}{E-U} d^{3N}pd^{3N}r$$
The Laplace transform of this is
\begin{aligned}
\mathcal{L}\left[\langle K^{-1}\rangle\right] &= \frac{1}{\omega} \frac{\Delta E}{g} \int \frac{e^{-sH}}{H-U}d^{3N}pd^{3N}r
\\&= \frac{1}{\omega} \frac{\Delta E}{g} \int \frac{ \exp(-s\sum_{i=1}^N \frac{p_i^2}{2m_i}) }{ \sum_{i=1}^N \frac{p_i^2}{2m_i} } d^{3N}p \int \exp(-sU) d^{3N}r
\end{aligned}
To calculate the integral over $p$, we can make a coordinate transform $p^\prime_i = p_i/\sqrt{2m_i}$, and let $\xi^2 = \sum {p^\prime}^2$:
\begin{aligned}
\mathcal{L}\left[\langle K^{-1}\rangle\right] &= \frac{1}{\omega} \frac{\Delta E}{g} 2^{3N/2} \prod_{i=1}^Nm_i^{3/2} \int \frac{e^{-s\xi^2}}{\xi^2} d^{3N}p^\prime \int e^{-sU} d^{3N}r
\\&=\frac{1}{\omega} \frac{\Delta E}{g} 2^{3N/2} \prod_{i=1}^Nm_i^{3/2} \int_0^\infty \frac{e^{-s\xi^2}}{\xi^2} A(\xi)d\xi \int e^{-sU} d^{3N}r
\end{aligned}
where $A(\xi)$ is the surface of a 3N-dimension sphere with radius $\xi$ and is equal to
$$ A(\xi) = \frac{2\pi^{3N/2}}{\Gamma(3N/2)}\xi^{3N-1}$$
Substitution of this into the integration above yields
$$\mathcal{L}\left[\langle K^{-1}\rangle\right] = \frac{1}{\omega} \frac{\Delta E}{g} (2\pi)^{3N/2} \prod_{i=1}^N m_i^{3/2} \frac{1}{3N/2-1} \int \frac{e^{sU}}{s^{3N/2-1}} d^{3N}r$$
By inverse Laplace transform, we can finally get
$$\langle K^{-1} \rangle = \frac{1}{\omega} \frac{\Delta E}{g} (2\pi)^{3N/2} \prod_{i=1}^N m_i^{3/2} \frac{1}{3N/2-1} \int \frac{(E-U)^{3N/2-2}}{\Gamma(3N/2-1)} \Theta(E-U) d^{3N}r$$
By comparing this with the result of $1/kT$, we can easily find
$$\left(\frac{3N}{2}-1\right)kT=\left[\langle K^{-1}\rangle\right]^{-1}$$
So using different definitions of entropy yields different results. The authors elaborated on why they believe $S=k\ln\omega$ is the right one to use. They calculated the distribution of speeds using both $\omega$ and $\Omega$ and demonstrated that only the former is consistent with their MD simulation of 10 hard-sphere particles. Since the entropy can also be seen as an ensemble average ($S=-k\langle \ln\rho\rangle$, $\rho$ is the probability density of a microstate), it is reasonable to do it in the phase space $\omega$ too. Also, they found that $\langle K\rangle_\Omega=\sum\langle p_i^2\rangle_\Omega / 2m_i \not= \langle E-U\rangle_\Omega$ and this result is considered as not physical. Those who approve the use of $S=k\ln\Omega$ often cite the result that $\Omega$ is adiabatically invariant, but this reason is insufficient in the authors' opinion.
The authors also derived the result for $NVE\mathbf{PG}$ ensemble in which total momentum $\mathbf{P}$ and $\mathbf{G}=t\mathbf{P} - M_{total}r_c$ are also constant. This is important for molecular dynamics simulation because most of the time we do the "$NVE$ ensemble MD", we only simulate once and then calculate the time average. If there is no external force, what we get is actually a $NVE\mathbf{PG}$ ensemble. The results are slightly different:
$$\langle\frac{\mathbf{p}_i^2}{2m_i}\rangle = \frac{M_{totle}-m_i}{M_{total}(N-1)}\langle E-U\rangle + \frac{Nm_i-M_{total}}{2(N-1)M_{total}^2}\mathbf{P}^2$$
$$\left(\frac{3(N-1)}{2}-1\right)kT=\left[\left\langle \left(E-U-\frac{\mathbf{P}^2}{2M_{total}}\right) ^ {-1} \right\rangle\right]^{-1}$$
In the thermodynamic limit where $N$ is large, the result of the equipartition theorem and the results derived in this paper for $NVE$ and $NVE\mathbf{PG}$ ensemble become equivalent. So in that case it's still safe to use the equipartition theorem to calculate the temperature in $NVE$ MD simulations.
Best Answer
The density $\rho$ would count the number of microstates within the volume $d^{3N}p\,d^{3N}q$ that satisfies the energy constraint $E<\mathcal{H}<E+\Delta$. So you'd actually have:
\begin{align} \Gamma(E) &= \int_{E<\mathcal{H}<E+\Delta} d^{3N}p\,d^{3N}q \\ &= \int \rho\,d^{3N}p\,d^{3N}q \end{align}