Yes, the density matrix reconciles all quantum aspects of the probabilities with the classical aspect of the probabilities so that these two "parts" can no longer be separated in any invariant way.
As the OP states in the discussion, the same density matrix may be prepared in numerous ways. One of them may look more "classical" – e.g. the method following the simple diagonalization from equation 1 – and another one may look more quantum, depending on states that are not orthogonal and/or that interfere with each other – like equations 2.
But all predictions may be written in terms of the density matrix. For example, the probability that we will observe the property given by the projection operator $P_B$ is
$$ {\rm Prob}_B = {\rm Tr}(\rho P_B) $$
So whatever procedure produced $P_B$ will always yield the same probabilities for anything.
Unlike other users, I do think that this observation by the OP has a nontrivial content, at least at the philosophical level. In a sense, it implies that the density matrix with its probabilistic interpretation should be interpreted exactly in the same way as the phase space distribution function in statistical physics – and the "quantum portion" of the probabilities inevitably arise out of this generalization because the matrices don't commute with each other.
Another way to phrase the same interpretation: In classical physics, everyone agrees that we may have an incomplete knowledge about a physical system and use the phase space probability distribution to quantify that. Now, if we also agree that probabilities of different, mutually excluding states (eigenstates of the density matrix) may be calculated as eigenvalues of the density matrix, and if we assume that there is a smooth formula for probabilities of some properties, then it also follows that even pure states – whose density matrices have eigenvalues $1,0,0,0,\dots$ – must imply probabilistic predictions for most quantities. Except for observables' or matrices' nonzero commutator, the interference-related quantum probabilities are no different and no "weirder" than the classical probabilities related to the incomplete knowledge.
If $|\psi\rangle$ is a n-electron state and
$$
\psi(x_1, x_2, x_3, \dots, x_n) = \langle x_1, x_2, x_3, \dots, x_n | \psi\rangle
$$
is its (anti-symmetric) n-electron wave function in position representation, then
$$
\frac{1}{n} {\hat\Gamma} = Tr_{2,3,\dots, n} |\psi\rangle \langle \psi| = \int{dx_1 dx'_1\,|x_1\rangle \,\Big( \int{dx_2 dx_3 \dots dx_n \psi^*(x_1, x_2, x_3, \dots, x_n) \psi(x'_1, x_2, x_3, \dots, x_n)\; }\Big) \langle x'_1| }
$$
is the single-electron reduced density matrix and
$$
\frac{1}{n}\Gamma(x, x') = \langle x | {\hat \Gamma} | x' \rangle
$$
gives its matrix elements in position representation. Notice that since $\psi(x_1, x_2, x_3, \dots, x_n)$ is anti-symmetric in the electron coordinates, it doesn't matter which electrons are traced out.
The equation
$$
\frac{1}{n} {\hat\Gamma} |\Phi_k\rangle = p_k |\Phi_k\rangle
$$
is indeed the eigenvalue equation for the reduced single-electron density matrix ${\hat \Gamma}$ and the NBOs $|\Phi_k\rangle$ are its eigenstates. This means that an electron can be found with probability $p_k$ in $|\Phi_k\rangle$. Or in density functional theory (DFT) lingo, the (possibly fractional) occupation number of orbital $\Phi_k$ is $n_k = np_k$. Otherwise, everything that pertains to (reduced) density matrices applies to ${\hat \Gamma}$ as well. So,
1a. Is $\rho|\psi\rangle$ a legal operation outside the context of eigenvalue equations?
Yes, absolutely, $\rho$ is just another operator.
1b. If it is legal, what physical context will it arises. how to interpret the resulting state vector?
It doesn't necessarily mean that $\rho|\psi\rangle$ has some special physical significance. On the other hand, for any state $|\psi\rangle$ the matrix element $\langle \psi |\rho |\psi\rangle$ has the meaning of "probability to measure the system (electron) in the pure state $|\psi\rangle$ when it is in the mixed state $\rho$". For ${\hat \Gamma}$ this means $\langle \Phi |\frac{1}{n}{\hat \Gamma} |\Phi\rangle$ gives the probability to find an electron in orbital $|\Phi\rangle$ if the reduced single-electron state is $\frac{1}{n}{\hat \Gamma}$.
2a. Since a density matrix is a matrix representation of the density operator, given any linear operator ${\hat O}$ that is used in quantum mechanics and is not necessary an observable nor the unitary time evolution operator $U(t,t′)$, is the product ${\hat O}\rho$ legal? What is its meaning?
Yes, the product ${\hat O}\rho$ is well-defined, but again, it doesn't have special physical significance by itself.
2b. Is $\rho_1\rho_2$, the multiplication of one density matrix onto a different one legal?
Again, yes. But it usually doesn't arise when $\rho_1$ and $\rho_2$ both refer to the same single electron coordinates (it is tempting to say "the same electron", but then electrons are indistinguishable, so we can't possibly know that it is "the same"). It usually means a direct product $\rho_1 \otimes \rho_2$ for two electrons (two sets of electron coordinates).
2c. What are examples of physical or experimental scenario where the need to project a mixed state to another mixed state arises?
It's not a projection!
Technically all quantum mechanical equations are linear in the density matrix of the system, so whenever you see this sort of product you know two things:
there are (at least) two electrons involved and the (direct) product $\rho_1 \otimes \rho_2 $ refers to (part of) a two-electron state ;
very likely there is some self-consistent approximation about the nature of the two-electron reduced density matrix (obtained by tracing the n-electron density matrix $|\psi\rangle\langle \psi|$ over all but two electrons). This is because the two-electron density matrix is a product of single electron density matrices only when the electrons are uncorrelated.
In DFT matrix elements of products for $\rho_1 = \rho_2 = \rho$ are ubiquitous and you can easily find them in a variety of quantities: the Hartree energy, the exchange energy, the two-electron correlation function, the exchange correlation function, etc.
Best Answer
That depends on what you understand "density matrix" to mean. You seem to think that it refers to operators of the form $$\rho=\sum_{k=1}^n p_k|\psi_k\rangle\langle\psi_k|,\tag{1}$$ where $p_k\geq0$ for all $k$, $\sum_{k=1}^n p_k=1$, and the $|\psi_k\rangle$ are vectors in some $N$-dimensional Hilbert space $\mathcal{H}$. As a contrast, QMechanic's answer relies on a characterization of density matrices as positive semidefinite hermitian operators with trace 1. Proving the equivalence of these definitions is a very informing exercise and it will probably teach you more than your current problem.
To prove that $\rho=p_1\rho_1+p_2\rho_2$ is a density matrix by your definition, you will need to rely on an eigenvalue-eigenvector decomposition. Writing $\rho$ as in equation (1) is possible because $\rho$ is hermitian; the problem is then proving the two conditions on the $p_k$. The first one is equivalent to $\rho$ being positive semidefinite (why?), and this you can prove using the (real) abstract definition of that: $$\langle v|\rho v\rangle\geq0\,\forall v\in \mathcal{H}.$$ The sum condition you can prove by taking the trace of the different expressions you have for $\rho$.