I don't really see the answer in the other answer so let me do the calculation here. Your general Boltzmann Ansatz says that the probability of a state $n$ depends on its energy as
$$ p_n = C \exp(-\beta E_n) $$
where $\beta = 1/kT$. Fermions are identical particles that, for each "box" or one-particle state they can occupy (given e.g. by $nlms$ in the case of the Hydrogen atom-like states), admit either $N=0$ or $N=1$ particles in it. Higher numbers are forbidden by the Pauli exclusion principle. The energies of the multi-particle state with $N=1$ and $N=0$ in a particular one-particle state $nlms$ differ by $\epsilon$. Consequently,
$$ \frac{p_1}{p_0} = \frac{C\exp(-\beta (E+\epsilon))}{\exp(-\beta E)} = \exp(-\beta \epsilon) $$
where I used the Boltzmann distribution. However, the probabilities that the number of particles in the given one-particle state is equal to $N=0$ or $N=1$ must add to one,
$$ p_0 + p_1 = 1.$$
These conditions are obviously solved by
$$ p_0 = \frac{1}{1+\exp(-\beta\epsilon)}, \qquad p_1 = \frac{\exp(-\beta\epsilon)}{1+\exp(-\beta\epsilon)}, $$
which implies that the expectation value of $n$ is equal to the right formula for the Fermi-Dirac distribution:
$$\langle N \rangle = p_0\times 0 + p_1 \times 1 = p_1= \frac{1}{\exp(\beta\epsilon)+1} $$
The calculation for bosons is analogous except that the Pauli exclusion principle doesn't restrict $N$. So the number of particles (indistinguishable bosons) in the given one-particle state may be $N=0,1,2,\dots $. For each such number $N$, we have exactly one distinct state (because we can't distinguish the particles). The probability of each such state is called $p_n$ where $n=0,1,2,\dots$.
We still have
$$\frac{p_{n+1}}{p_n} = \exp(-\beta\epsilon) $$
and
$$ p_0 + p_1 + p_2 + \dots = 1 $$
These conditions are solved by
$$ p_n = \frac{\exp(-n\beta\epsilon)}{1+\exp(-\beta\epsilon)+\exp(-2\beta\epsilon)+\dots } $$
Note that the ratio of the adjacent $p_n$ is what it should be and the denominator was chosen so that all the $p_n$ from $n=0,1,2\dots$ sum up to one.
The expectation value of the number of particles is
$$ \langle N \rangle = p_0 \times 0 + p_1 \times 1 + p_2\times 2 + \dots $$
because the number of particles, an integer, must be weighted by the probability of each such possibility. The denominator is still inherited from the denominator of $p_n$ above; it is equal to a geometric series that sums up to
$$ \frac{1}{1-q} = \frac{1}{1-\exp(-\beta\epsilon)} $$
Don't forget that $1-\exp(-\beta\epsilon)$ is in the denominator of the denominator, so it is effectively in the numerator.
However, the numerator of $\langle N \rangle$ is tougher and contains the extra factor of $n$ in each term. Nevertheless, the sum is analytically calculable:
$$ \sum_{n=0}^\infty n \exp(-n \beta\epsilon) = - \frac{\partial}{\partial (\beta\epsilon)} \sum_{n=0}^\infty \exp(-n \beta\epsilon) =\dots$$
$$\dots = - \frac{\partial}{\partial (\beta\epsilon)} \frac{1}{1-\exp(-\beta\epsilon)} = \frac{\exp(-\beta\epsilon)}{(1-\exp(-\beta\epsilon))^2} $$
This result's denominator has a second power. One of the copies gets cancelled with the denominator before and the result is therefore
$$ \langle N \rangle = \frac{\exp(-\beta\epsilon)}{1-\exp(-\beta\epsilon)} = \frac{1}{\exp(\beta\epsilon)-1} $$
which is the Bose-Einstein distribution.
You could also obtain another version of the Boltzmann distribution for distinguishable particles by a similar calculation. For such particles, $N$ could take the same values as it did for bosons. However, the multiparticle state with $N$ particles in the one-particle state would be degenerate because the particles are distinguishable. There would actually be $N!$ multiparticle states with $N$ particles in them. The sum would yield a Taylor expansion for the same exponential.
Note added later: the derivation above was for $\mu=0$. When the chemical potential is nonzero, all appearances of $\epsilon$ have to be replaced by $(\epsilon-\mu)$. Of course that one may only talk about a well-defined value of $\mu$ when we deal with a grand canonical potential; it is impossible to derive a formula depending on $\mu$ from one that contains no $\mu$ and assumes it's ill-defined. The derivation above was meant to show that the difficult $1/(\exp\pm 1)$ structures do appear from a simpler $\exp(-\beta E)$ Ansatz because I think it's the only nontrivial thing to be shown while discussing the relations between the Boltzmann and BE/FD distributions. If that derivation proves the same link as the textbook does, then I apologize but I think there is "nothing else" of a similar kind to be proven.
The difference lies in the way we count the number of states of the system in quantum and classical cases.
The formulas you wrote are actually for the grand canonical partition functions for a single energy state, not for the whole system including all the energy states. The total grand canonical partition function is $$\mathcal{Z} = \sum_{all\ states}{e^{-\beta(E-N\mu)}} = \sum_{N=0}^\infty\sum_{\{E\}}{e^{-\beta(E-N\mu)}}$$
Now, if the particles are bosons, then the energy eigenstates are countable as $\{\epsilon_i\}$ and $\mathcal{Z}$ would be $$\mathcal{Z} = \sum_{\{n_i\}}e^{-\beta\sum_{i}{n_i(\epsilon_i-\mu)}} = \sum_{\{n_i\}}\prod_ie^{-\beta n_i(\epsilon_i-\mu)}=\prod_i \mathcal{Z_i}^{B-E}$$
where $n_i$ is the number of particles in $i$-th energy state, thus $\sum_i{n_i}=N$, and
$$\mathcal{Z_i}^{B-E} = \sum_{n=0}^\infty e^{-n\beta(\epsilon_i-\mu)}$$
Here, $\mathcal{Z_i}^{B-E}$ is the grand canonical partition function for one energy eigenstate with energy $\epsilon_i$ in Bose-Einstein statistics.
On the other hand, in classical regime the energy of a particle can take any energy. In this case, one point in the $6N$-dimensional phase space denotes one state of system. Therefore, the energy states are not countable as there is an infinite number of points in the phase space within any phase space volume. To count the states we take the "semi-classical" approach by taking the phase space volume of one state of the system to be $(2\pi\hbar)^{3N}$. We can then integrate over the whole phase space and divide the integral by this unit volume to get the number of states. However, as the particles are assumed to be indistinguishable, any permutation of the system configuration (the set of $\{\vec{x}_n,\ \vec{p}_n\}$) would actually be the same state of the system. Therefore, when we integrate over the whole phase space volume we overcount the total number of states by $N!$. That's why we need to divide the integral by Gibbs factor $N!$. For a system of non-interacting particles, the N-particle canonical partition function then can be written as $Z_N = \frac{Z_1^N}{N!}$ where $Z_1$ is the canonical partition function for one particle.
Now, the grand canonical partition function for a classical system would be
$$\begin{align}
\mathcal{Z}&=\sum_{N=0}^\infty{\int_0^\infty dE\ \Omega(E,N)e^{-\beta(E-\mu N)}}
=\sum_{N=0}^\infty e^{\beta\mu N} Z_N = \sum_{N=0}^\infty e^{\beta\mu N} \frac{Z_1^N}{N!}\\
\end{align}$$
If we want to derive the partition function $\mathcal{Z_i}$ for a single energy state similar to the Bose-Einstein statistics, we can assume the energy of a single particle to be discrete and countable as $\{\epsilon_i\}$. This can be achieved by dividing the one particle phase space into s of unit volume and assigning one representative energy to every unit volume section. Then, the grand canonical partition function is,
$$ \begin{align}
\mathcal{Z} &= \sum_{N=0}^\infty \frac{e^{\beta\mu N}}{N!} (\sum_i e^{-\beta\epsilon_i})^N\\
&=\sum_{N=0}^\infty \frac{e^{\beta\mu N}}{N!} \sum_{\{n_i\},\sum n_i = N} \frac{N!}{\prod_i n_i!} e^{-\beta\sum_i n_i\epsilon_i}\\
&=\sum_{\{n_i\}}\prod_i \frac{1}{n_i!} e^{-\beta n_i(\epsilon_i-\mu)}\\
&=\prod_i \sum_{n=0}^\infty \frac{1}{n!} e^{-\beta n(\epsilon_i-\mu)} = \prod_i \mathcal{Z_i}^{M-B}
\end{align}$$
This $\mathcal{Z_i}^{M-B}$ is the single energy state grand canonical partition function in Maxwell-Boltzmann statistics.
Maxwell-Boltzmann statistics is the classical limit for Bose-Einstein statistics. The condition for the system to be classical is the single state occupation number $\bar{n}$ to satisfy $\bar{n} \ll 1$, in other words, the total number of single particle states $M$ should satisfy $N \ll M$. As, the single particle partition function $Z_1$ is actually a weighted sum over all the states, $N\ll Z_1$ will satisfy $N \ll M$ for the system to be classical.
Best Answer
I will give you the argument that is presented in the book I learned from, Physical Gas Dynamics. For reference, it's page 104, Chapter 4 section 6. I present it instead because it does not rely on the Boltzmann distribution for the derivation so perhaps there is some additional insight for you. They established in section 5 that:
$$ \frac{N_j^*}{C_j} = \frac{1}{e^{\alpha + \beta \epsilon_j} \mp 1} $$
for the BE and FD statistics respectively. The other notation: $N_j^*$ is the number of particles in the energy state $j$, $C_j$ is the number of possible levels and the coefficients $\alpha$ and $\beta$ have yet to be determined.
At "sufficiently high" temperatures, $C_j \gg N_j^*$ and the only way for that to be true is if the denominator is large. And when the denominator is large, the exponential is much bigger than 1 so that term may be neglected. So both FD and BE statistics lead to the same expression:
$$ \frac{N_j^*}{C_j} = e^{-\alpha - \beta \epsilon_j} $$
Then they apply this to the number of microstates (after Sterling's approximation):
$$ \ln W = \Sigma_j \left[ \pm C_j \ln \left(1\pm\frac{N_j}{C_j}\right)+N_j\ln\left(\frac{C_j}{N_j}\pm 1\right)\right]$$
where the $\pm$ is for BE and FD respectively. So they plug in the expression for $N_j^*/C_j$ and further give the approximation: $\ln\left(1 \pm x\right) \approxeq \pm x$ for $x \ll 1$ to arrive at the Boltzmann limit:
$$ \ln W = \Sigma_j N_j \left( \ln \frac{C_j}{N_j} + 1 \right)$$
So the entire derivation really just hinges on the notion that at "sufficiently large" temperatures, there are considerably more energy states than particles and so the vast majority of them are empty. Furthermore, both FD and BE statistics have the same result because so many states are empty that the likelihood of two particles attempting to occupy the same level is very low.
This entire derivation relies only on information about quantum statistics and does not require the hand-waving needed to reach the same conclusion from classical mechanics like the derivation from the Boltzmann distribution does (so say the authors of the book immediately following the derivation).
It goes on further to say that $\epsilon_j$ has a wide range of values, including zero, and so the only way for the exponential to be large for all states is to have $\alpha \gg 1$. This is equivalent to (and it's left as an exercise):
$$ \frac{V}{N} \times \frac{\left(2 \pi m k T\right)^{3/2}}{h^3} \gg 1$$
which is violated for very small $m$ with a very large number density $N/V$.
Then it's fairly straight forward to eliminate $\alpha$ and come up with an expression with only $\beta$, which is not possible to solve practicably. So then they use the expression for the maximum number of microstates and perturb the solution about $N_j^*$ and neglect terms second or and higher of $\Delta N_j$.
Based on all of that, it is found that $\beta = \frac{1}{k T}$. But this hinges on the assumption that all possible microstates of a system are a priori equally probable.
Obviously I skipped a bunch of steps and equations in the derivation, but I highly recommend this book (or any other that shows the derivation from the quantum mechanics side rather than from the Maxwell-Boltzmann distribution side).