The difference lies in the way we count the number of states of the system in quantum and classical cases.
The formulas you wrote are actually for the grand canonical partition functions for a single energy state, not for the whole system including all the energy states. The total grand canonical partition function is $$\mathcal{Z} = \sum_{all\ states}{e^{-\beta(E-N\mu)}} = \sum_{N=0}^\infty\sum_{\{E\}}{e^{-\beta(E-N\mu)}}$$
Now, if the particles are bosons, then the energy eigenstates are countable as $\{\epsilon_i\}$ and $\mathcal{Z}$ would be $$\mathcal{Z} = \sum_{\{n_i\}}e^{-\beta\sum_{i}{n_i(\epsilon_i-\mu)}} = \sum_{\{n_i\}}\prod_ie^{-\beta n_i(\epsilon_i-\mu)}=\prod_i \mathcal{Z_i}^{B-E}$$
where $n_i$ is the number of particles in $i$-th energy state, thus $\sum_i{n_i}=N$, and
$$\mathcal{Z_i}^{B-E} = \sum_{n=0}^\infty e^{-n\beta(\epsilon_i-\mu)}$$
Here, $\mathcal{Z_i}^{B-E}$ is the grand canonical partition function for one energy eigenstate with energy $\epsilon_i$ in Bose-Einstein statistics.
On the other hand, in classical regime the energy of a particle can take any energy. In this case, one point in the $6N$-dimensional phase space denotes one state of system. Therefore, the energy states are not countable as there is an infinite number of points in the phase space within any phase space volume. To count the states we take the "semi-classical" approach by taking the phase space volume of one state of the system to be $(2\pi\hbar)^{3N}$. We can then integrate over the whole phase space and divide the integral by this unit volume to get the number of states. However, as the particles are assumed to be indistinguishable, any permutation of the system configuration (the set of $\{\vec{x}_n,\ \vec{p}_n\}$) would actually be the same state of the system. Therefore, when we integrate over the whole phase space volume we overcount the total number of states by $N!$. That's why we need to divide the integral by Gibbs factor $N!$. For a system of non-interacting particles, the N-particle canonical partition function then can be written as $Z_N = \frac{Z_1^N}{N!}$ where $Z_1$ is the canonical partition function for one particle.
Now, the grand canonical partition function for a classical system would be
$$\begin{align}
\mathcal{Z}&=\sum_{N=0}^\infty{\int_0^\infty dE\ \Omega(E,N)e^{-\beta(E-\mu N)}}
=\sum_{N=0}^\infty e^{\beta\mu N} Z_N = \sum_{N=0}^\infty e^{\beta\mu N} \frac{Z_1^N}{N!}\\
\end{align}$$
If we want to derive the partition function $\mathcal{Z_i}$ for a single energy state similar to the Bose-Einstein statistics, we can assume the energy of a single particle to be discrete and countable as $\{\epsilon_i\}$. This can be achieved by dividing the one particle phase space into s of unit volume and assigning one representative energy to every unit volume section. Then, the grand canonical partition function is,
$$ \begin{align}
\mathcal{Z} &= \sum_{N=0}^\infty \frac{e^{\beta\mu N}}{N!} (\sum_i e^{-\beta\epsilon_i})^N\\
&=\sum_{N=0}^\infty \frac{e^{\beta\mu N}}{N!} \sum_{\{n_i\},\sum n_i = N} \frac{N!}{\prod_i n_i!} e^{-\beta\sum_i n_i\epsilon_i}\\
&=\sum_{\{n_i\}}\prod_i \frac{1}{n_i!} e^{-\beta n_i(\epsilon_i-\mu)}\\
&=\prod_i \sum_{n=0}^\infty \frac{1}{n!} e^{-\beta n(\epsilon_i-\mu)} = \prod_i \mathcal{Z_i}^{M-B}
\end{align}$$
This $\mathcal{Z_i}^{M-B}$ is the single energy state grand canonical partition function in Maxwell-Boltzmann statistics.
Maxwell-Boltzmann statistics is the classical limit for Bose-Einstein statistics. The condition for the system to be classical is the single state occupation number $\bar{n}$ to satisfy $\bar{n} \ll 1$, in other words, the total number of single particle states $M$ should satisfy $N \ll M$. As, the single particle partition function $Z_1$ is actually a weighted sum over all the states, $N\ll Z_1$ will satisfy $N \ll M$ for the system to be classical.
Why are fondamental particles either bosons or fermions?
A particle specie is either a boson or a fermion depending on how the wave function changes when permuting two particles of the same specie. A general two particles wave function $\Psi(x_1,x_2)$ can be acted on by the operator $P$ which permutes the two particles. Of course, $P^2\Psi(x_1,x_2) = \Psi(x_1,x_2)$ so $1$ is an eigenvalue of $P^2$ and so the only two possible eigenvalues for $P$ are $\pm1$, $+$ for bosons and $-$ for fermions.
Why are composites also either bosons or fermions?
To answer this, I will cite the excellent answer to the question "How to combine two particles of spin $\frac{1}{2}$?". The main equation to remember is following with it's interpretation:
$$
(2j_a+1)\otimes(2j_b+1) = \bigoplus_{i=1}^n(2j_i+1),
$$
On the left-handed side, the object describes the hilbert space of two particles, one with spin $j_a$ the other with spin $j_b$. The particles are decoupled so they have a fix total spin and can have any corresponding projection.
On the right-handed side, the object describes the hilbert space of a single composite particle which have can change total spin $j_i \in \{|j_a-j_b|, |j_a-j_b+1|, ..., j_a+j_b\}$.
Both spaces are related by a change of basis. What you need to see is that the total spin, even if it changes, is either an integer or an half-integer. If both particles are fermions or bosons, the composite particle is a boson, but else, the composite particle is a fermion.
Best Answer
The thing is that Maxwell-Boltzmann does hold for any system. Quantum particles, however, are not only identical, but completely indistinguishable. For such particles, the distribution is characterized solely by the occupation number.
Consider a system of noninteracting identical particles. Suppose the single particles have a known Hamiltonian $H$ with eigenstates $|k\rangle$, such that
$$H|k\rangle=E_k|k\rangle.$$
If we have a bunch of say, $N\gg1$ of these particles, we write the quantum state of the whole system in a different basis (the Fock basis), given by the number of particles in state $k, n_k$:
$$|n_1,n_2,...\rangle,$$
And this basis completely characterizes our quantum state. These are eigenstates of the many body Hamiltonian, with
$$H|n_1,n_2,...\rangle=\left(\sum\limits_kn_kE_k\right)|n_1,n_2,...\rangle.$$
Suppose then that our ensemble of particles has a state probability
$$p(n_1,n_2,...).$$
Maxwell-Boltzmann statistics is generated by maximizing the Gibbs Entropy: $$ S=-\sum\limits_{n_1,n_2,...}p(n_1,n_2,...)\log p(n_1,n_2,...),$$
Where $\log$ is the natural logarithm. We can then do it subject to the constraints of normalization, total energy and total number of particles:
$$\sum\limits_{n_1,n_2,...}p(n_1,n_2,...)=1,\quad\sum\limits_{n_1,n_2,...}p(n_1,n_2,...)\sum\limits_kn_kE_k=U,\quad\sum\limits_{n_1,n_2,..}p(n_1,n_2,...)\sum\limits_kn_k=N.$$
The maximization condition then corresponds to (using Lagrange multipliers, the calculation is a little involved)
$$p(n_1,n_2,...)=\prod\limits_k\frac{e^{-\beta n_k(E_k-\mu)}}{\sum\limits_{n_k'}e^{-\beta n_k'(E_k-\mu)}},$$
where $\beta$ and $\mu$ are parameters corresponding to the inverse temperature, $(k_BT)^{-1}$ and chemical potential. If we take the marginal distribution for a single occupation number, say, $n_j$, by summing over all other possible values for all the other occupation numbers, we get the probability distribution for the occupation number in question:
$$p_j(n)=\frac{e^{-\beta n(E_j-\mu)}}{\sum\limits_{n'}e^{-\beta n'(E_j-\mu)}}.$$
The step further taken by Bose-Einstein and Fermi-Dirac statistics is now computing the average, or expected occupation number of the quantum state $j$:
$$f_j=\langle n_j\rangle=\sum\limits_nnp_j(n).$$
A nice computational trick is then writing $f_j$ as follows:
$$f_j=\frac{\sum\limits_nne^{-\beta n(E_j-\mu)}}{\sum\limits_{n'}e^{-\beta n'(E_j-\mu)}}$$
$$f_j=\frac{1}{E_j-\mu}\frac{\sum\limits_nn(E_j-\mu)e^{-\beta n(E_j-\mu)}}{\sum\limits_{n'}e^{-\beta n'(E_j-\mu)}}$$ $$f_j=-\frac{1}{E_j-\mu}\frac{\partial}{\partial\beta}\log\left(\sum\limits_ne^{-\beta n(E_j-\mu)}\right).$$
As such, if we can compute that logarithm on the right, we can find $f_j$. Here, we need to make an assumption.
Bosons:
For bosons, $n$ can be any integer from $0$ to $\infty$. As such, we write
$$\sum\limits_{n=0}^\infty e^{-\beta n(E_j-\mu)}=\frac{1}{1-e^{-\beta(E_j-\mu)}},$$ a convergent geometric series. We now plug this into our expression for $f_j$ and get
$$f_j=\frac{1}{e^{\beta(E_j-\mu)}-1},$$ which is Bose-Einstein statistics.
Fermions
For fermions, $n$ can be either 0 or 1. We then write the sum over $n$ as
$$1+e^{-\beta(E_j-\mu)},$$
Which then gives us
$$f_j=\frac{1}{e^{\beta(E_j-\mu)}+1},$$
The occupation number of Fermi-Dirac statistics.
EDIT
I just realized that I got ridiculously deep into the mathematics and forgot to anwer the question. The intuition behind all these calculations.
Thing is, in its essence, Maxwell-Boltzmann statistics is concerned with system states, and Bose-Einstein/Fermi-Dirac are concerned with particle states. It's easy to get both these concepts mixed up, but they are fundamentally different. While M-B gives us the probability for our system (all of our $N$ particles) to be in a specific configuration, B-E/F-D gives us the average number of particles that are in a specific state.
Since, by definition, these are averages, we have to average over all configurations of the system. And that is exactly what we did here.
It's also nice to mention that we can compute $\mu$ by solving the equation
$$\sum\limits_jf_j=N.$$