Yes, a subsystem of an entangled state, if this subsystem is entangled with the rest, is always in a mixed state, or "statistical mixture", which is used as a synonym in your discussion (and elsewhere).
If we're only interested in predictions for a subsystem $A$ in a system composed of $A,B$, then $A$ is described by a density matrix $\rho_A$ calculable by "tracing over" the indices of the Hilbert space for $B$:
$$\rho_A = {\rm Tr}_{i_b} \rho_{AB}$$
Suppose the whole system $AB$ is in a pure state,
$$\rho_{AB}= |\psi_{AB}\rangle\langle \psi_{AB}| $$
If $|\psi_{AB}\rangle$ is an entangled, i.e. non-separable, state, meaning that it cannot be written as $|\psi_A\rangle\otimes |\psi_B\rangle$ for any states $|\psi_A\rangle$ and $|\psi_B\rangle$, then the tracing over has the effect of picking all the terms in $|\psi_{AB}\rangle$, forgetting about their dependence on the $B$ degrees of freedom, and writing their probabilities on the diagonal of $\rho_A$. That's why the von Neumann entropy will be nonzero: there is a basis in which $\rho_A$ is diagonal, and at least two of its diagonal entries will be neither $0$ nor $1$.
Take a system of two qubits. We have qubit $A$ and qubit $B$. There are 4 natural basis vectors for the two qubits, $|00\rangle$, $|01\rangle$, $|10\rangle$, and $|11\rangle$ where the first digit refers to the value of $A$ and the second digit to $B$. A general pure state is a superposition of these four states with four coefficients $\alpha_{AB}$ where $A,B$ are $0,1$, matched to the corresponding values.
If $\alpha_{AB}$ may be written as $\beta_A\gamma_B$ i.e. factorized in this way, the pure state is separable. $|01\rangle$ is separable, for example. If it is not, then it is entangled. For example, $|00\rangle+|11\rangle$ is not separable so it is entangled.
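To make the two-qubit example concrete, here is a minimal NumPy sketch. The function names are mine, and it assumes the coefficients $\alpha_{AB}$ are stored as a $2\times 2$ array with rows labelled by $A$ and columns by $B$. For two qubits, $\alpha_{AB}=\beta_A\gamma_B$ holds exactly when that array has rank 1, i.e. zero determinant, and tracing over $B$ gives $\rho_A=\alpha\alpha^\dagger$ once the state is normalized.

```python
import numpy as np

def reduced_density_matrix_A(alpha):
    """Trace out qubit B from the pure state with coefficients alpha[A, B]."""
    alpha = np.asarray(alpha, dtype=complex)
    alpha = alpha / np.linalg.norm(alpha)   # normalize the state
    # rho_A[A, A'] = sum_B alpha[A, B] * conj(alpha[A', B])  (row = ket index)
    return alpha @ alpha.conj().T

def von_neumann_entropy(rho):
    """S = -Tr(rho ln rho), computed from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                        # 0 * ln 0 is taken as 0
    return float(-np.sum(p * np.log(p)))

def is_separable(alpha, tol=1e-12):
    """A two-qubit pure state factorizes iff det(alpha) = 0 (rank 1)."""
    return abs(np.linalg.det(np.asarray(alpha, dtype=complex))) < tol

product = [[0, 1], [0, 0]]   # |01>
bell = [[1, 0], [0, 1]]      # |00> + |11>, normalized inside the function

for name, alpha in [("|01>", product), ("|00>+|11>", bell)]:
    rho_A = reduced_density_matrix_A(alpha)
    print(name, "separable:", is_separable(alpha),
          "entropy of A:", von_neumann_entropy(rho_A))
```

For $|01\rangle$ the reduced state of $A$ is pure and the entropy vanishes; for $|00\rangle+|11\rangle$ the reduced state is half the identity and the entropy is $\ln 2$.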
The mixed state is a more general state than a pure state. In this case, it is given by a $4\times 4$ Hermitian matrix $\rho$. The matrix entries are $\rho_{AB,A'B'}$ where the unprimed and primed indices refer to the values of qubits $AB$ in the bra and ket vectors, respectively. If these matrix entries may be factorized to
$$\rho_{AB,A'B'} = \alpha^*_{AB}\alpha_{A'B'}$$
for some coefficients $\alpha_{A'B'}$ and their complex conjugates that specify a pure state $|\psi_{AB}\rangle$, then the density matrix $\rho$ is equivalent to the pure state $|\psi_{AB}\rangle$ and we say that the system is in a pure state. In the more general case, $\rho$ can't be written as this factorized product but only as a sum of similar products. If you need at least two terms like that to write $\rho$, then the state is mixed and the von Neumann entropy is therefore nonzero.
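A quick numerical way to tell the two cases apart is to count the nonzero eigenvalues of $\rho$, or to compute the purity ${\rm Tr}\,\rho^2$. The sketch below is only an illustration under my own conventions: amplitudes are listed in the order $|00\rangle,|01\rangle,|10\rangle,|11\rangle$, the ket index comes first in `np.outer`, and the helper names are not from the discussion above.

```python
import numpy as np

def pure_density_matrix(amplitudes):
    """Density matrix |psi><psi| built from amplitudes (normalized here)."""
    psi = np.asarray(amplitudes, dtype=complex).ravel()
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def purity(rho):
    """Tr(rho^2): equals 1 for a pure state, strictly less for a mixed one."""
    return float(np.real(np.trace(rho @ rho)))

def rank(rho, tol=1e-12):
    """Number of factorized terms needed, i.e. number of nonzero eigenvalues."""
    return int(np.sum(np.linalg.eigvalsh(rho) > tol))

# Pure entangled state |00> + |11>: a single factorized term suffices.
rho_bell = pure_density_matrix([1, 0, 0, 1])

# Equal mixture of |00> and |11>: needs two terms, so it is mixed.
rho_mixed = 0.5 * pure_density_matrix([1, 0, 0, 0]) \
          + 0.5 * pure_density_matrix([0, 0, 0, 1])

print(purity(rho_bell), rank(rho_bell))
print(purity(rho_mixed), rank(rho_mixed))
```

Both matrices have the same diagonal entries; they differ only in the off-diagonal elements connecting $|00\rangle$ and $|11\rangle$.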
To start with, the double slit experiment gives interference even when the beam consists of one photon at a time. Each spot on the screen is a single photon/particle; the statistical accumulation of spots shows the interference, as also expected classically.
The connection arises because the photon, as a quantum mechanical entity, has a wavefunction that is a solution of Maxwell's equations, treated as operators acting on the wavefunction. The frequency $\nu$ in the photon's energy $E=h\nu$ is the same as the frequency of the classical solution of Maxwell's equations, and that, together with the phase attached to the wavefunction, allows the continuity from the classical level down to the quantum single-photon level. For a formal treatment of how the classical field emerges from an ensemble of photons, have a look at this blog entry.
There exists a fundamental difference between classical and quantum equations and their solutions, but also a continuity at the interface.
Edit after question edit
How is the theory of partial coherent light related to quantum-mechanics?
This needs somebody familiar with the formalism of both, but I believe the connection should follow the method in the link, how classical electromagnetic beams emerge from an ensemble of photons.
... the amplitude of a wave function ... But is this really a fundamental difference, or just a difference in the common practices of the respective theories?
The squared modulus of the wavefunction is the connection with predictions and experiments in quantum mechanics; it is the probabilistic nature that makes the difference from the classical framework, as far as I know.
How much of the strange phenomena of quantum-mechanics can be explained by the theory of partial coherent light alone, without any reference to particles or measurement processes?
A phenomenon is "an observable". Observing something implies a measurement process, and measurement implies interaction: picking up a point that will contribute to the quantum mechanical probability distribution (or building up the distribution itself by continuous observations). So there is an inherent contradiction in this part of the question.
Best Answer
The confusion arises because the word “coherent” evolved to have different meanings in different contexts where it is not fully qualified.
Going back to the 2-slit experiment, one shows that the intensity of the signal at a particular point $$ I_{tot}(x)\ne I_{1}(x)+I_{2}(x)\tag{1} $$ is not the simple sum of the intensities of the signals from the two source slits. This is because the light from the slits is “coherent” in the sense that the signals can interfere at a point. This website gives some details but basically the intensity at one point is of the form $$ I_{tot}(x)= (A(x)+B(x))^2\tag{2} $$ with cross-terms of the type $A(x)B(x)$ typical of interference between terms. (Despite the efforts of generations of students, $(A+B)^2\ne A^2+B^2$ so (2) CANNOT be the sum of the individual intensities in general.)
This is in contrast with incoherent light, where the intensity at a point is just the sum of the individual intensities of the different sources: $I_{tot}=I_1+I_2$. This is what happens if you shine two flashlights at a wall: the intensity of the light on the wall is just the sum of the intensities from the two flashlights, and there are no dark and bright interference fringes.
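The difference between the two ways of adding is easy to see numerically. The sketch below is only illustrative: it uses complex amplitudes `E1`, `E2` in place of the real $A(x)$, $B(x)$ above, and the phase gradient `k` between the two paths is a made-up number, not taken from any real setup.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 201)       # position on the screen (arbitrary units)
k = 2 * np.pi * 3                     # hypothetical phase gradient between paths

E1 = np.ones_like(x, dtype=complex)   # amplitude arriving from slit 1
E2 = np.exp(1j * k * x)               # amplitude from slit 2, with relative phase

# Coherent sources: add the amplitudes first, then square.
I_coherent = np.abs(E1 + E2) ** 2

# Incoherent sources: square each amplitude first, then add.
I_incoherent = np.abs(E1) ** 2 + np.abs(E2) ** 2

# The whole difference is the interference cross-term.
cross_term = 2 * np.real(np.conj(E1) * E2)

print(I_coherent.min(), I_coherent.max())      # fringes
print(I_incoherent.min(), I_incoherent.max())  # flat
```

The coherent intensity oscillates between nearly zero and four times the single-source intensity, while the incoherent sum is flat.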
Now, in a linear combination of wave functions, say $$ \psi(x) = \alpha \psi_1(x) +\beta \psi_2(x), \tag{3} $$ the various parts can, in general, interfere in the sense that the probability density \begin{align} \vert \psi(x)\vert^2 &= \vert \alpha\vert^2 \vert\psi_1(x)\vert^2+ \vert\beta\vert^2\vert\psi_2(x)\vert^2 \\ &\quad + \alpha^*\beta\psi_1(x)^*\psi_2(x) +\alpha\beta^*\psi_1(x)\psi_2(x)^* \end{align} is not just the sum of the probability densities of the individual components, i.e. it contains cross-terms of the type $$ \alpha^*\beta\psi_1(x)^*\psi_2(x)+\hbox{c.c.} $$ and is therefore reminiscent of (2). Thus we speak here of “coherent superposition”. The state of (3) is actually a pure state.
In a mixed state (which cannot be described by a wavefunction), the probability density is a sum of individual probability densities, i.e. something like $P(x) =\vert\alpha \psi_1(x)\vert^2+\vert \beta\psi_2(x)\vert^2$ without the interference term. Note that $\psi_1(x)$ could itself be a sum, i.e. $\psi_1(x)=a \phi(x)+b \chi(x)$ so that $\vert\psi_1(x)\vert^2 = \vert a\phi(x)+b\chi(x)\vert^2$ can have cross-terms, but there would be no cross-terms between the pieces in $\psi_1(x)$ and $\psi_2(x)$.
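In density-operator language the same comparison can be written out explicitly. This is a sketch assuming $\psi_1$ and $\psi_2$ are normalized and $\vert\alpha\vert^2+\vert\beta\vert^2=1$; the symbol $P(x)$ for the position probability density is my notation.

$$ \rho_{\rm pure} = \vert\psi\rangle\langle\psi\vert\, , \qquad \rho_{\rm mixed} = \vert\alpha\vert^2\,\vert\psi_1\rangle\langle\psi_1\vert + \vert\beta\vert^2\,\vert\psi_2\rangle\langle\psi_2\vert\, , \qquad P(x)=\langle x\vert\rho\vert x\rangle\, . $$

With $\vert\psi\rangle=\alpha\vert\psi_1\rangle+\beta\vert\psi_2\rangle$ as in (3), the two operators differ only by the off-diagonal pieces

$$ \rho_{\rm pure}-\rho_{\rm mixed} = \alpha\beta^*\,\vert\psi_1\rangle\langle\psi_2\vert + \alpha^*\beta\,\vert\psi_2\rangle\langle\psi_1\vert\, , $$

and taking $\langle x\vert\cdots\vert x\rangle$ of this difference gives exactly the cross-terms $\alpha^*\beta\,\psi_1(x)^*\psi_2(x)+\hbox{c.c.}$ that are present in the pure state and absent in the mixture.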
Now as to coherent states. Glauber investigated the question of coherence in quantum optics, i.e. the coherence properties of the quantized electromagnetic field. The tool of choice here is the correlation function, and Glauber was able to find a linear combination of harmonic oscillator states that was “coherent to all orders” in the sense of the correlation function. These states Glauber naturally called “coherent states”. Coherent states are pure states, so their various parts can interfere, and they are thus coherent in the sense that cross-terms appear in the probability density. However, whereas all pure states are coherent superpositions of basis states, not all of them are “coherent” in Glauber's sense: in general, their correlation functions do not satisfy the condition set out by Glauber.
To make matters worse, Perelomov realized that the Glauber coherent states could be generalized mathematically. He observed that the Glauber coherent states could be written as $$ \vert\alpha\rangle = T(\alpha)\vert 0\rangle \tag{4} $$ where $T(\alpha)$ is a displacement in the plane: $$ T(\alpha)=e^{\alpha a^\dagger - \alpha^* a}\, . $$ Perelomov used this last property to introduce “generalized coherent states”, which are just displacements of some special state (see for instance Perelomov, A. (2012). Generalized coherent states and their applications. Springer Science & Business Media.). Hence, “spin coherent states” are defined using rotations, i.e. displacements on the sphere, by $$ \vert\theta,\phi\rangle = R_z(\phi)R_y(\theta)\vert J,J\rangle\, . \tag{5} $$ (One can also displace the $\vert J,-J\rangle$ state.)
One can show that the Glauber coherent state $\vert\alpha\rangle$ of Eq. (4) turns out to be an eigenstate of the annihilation operator $a$, i.e. $a\vert\alpha\rangle=\alpha\vert\alpha\rangle$. This cannot happen with a nonzero eigenvalue when the Hilbert space is finite-dimensional, because the ladder operators are then nilpotent, so the angular momentum coherent states of (5) are in general not eigenstates of either $J_+$ or $J_-$. However, they share many properties with the Glauber coherent states. Both sets of states have minimum uncertainty (when the angular momentum operators are properly defined), and both sets of states produce specific factorization properties when computing some quantities. Generalized coherent states are not limited to angular momentum but have been defined for a variety of cases, either by insisting that they have minimum uncertainty or that they are translates of some distinguished state.
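The eigenvalue and minimum-uncertainty properties of the Glauber states can be checked numerically in a truncated Fock basis. The sketch below rests on my own assumptions: the standard expansion $\vert\alpha\rangle = e^{-\vert\alpha\vert^2/2}\sum_n \alpha^n/\sqrt{n!}\,\vert n\rangle$, a cutoff `N = 40` that is ample for the chosen $\alpha$, quadratures $x=(a+a^\dagger)/\sqrt{2}$ and $p=(a-a^\dagger)/(i\sqrt{2})$ with $\hbar=1$, and helper names that are mine.

```python
import numpy as np
from math import factorial

def annihilation(N):
    """Annihilation operator a in the truncated Fock basis |0>, ..., |N-1>."""
    return np.diag(np.sqrt(np.arange(1, N)), k=1).astype(complex)

def glauber_state(alpha, N):
    """Fock coefficients exp(-|alpha|^2/2) alpha^n / sqrt(n!), for n < N."""
    n = np.arange(N)
    sqrt_fact = np.sqrt(np.array([float(factorial(int(m))) for m in n]))
    return np.exp(-abs(alpha) ** 2 / 2) * alpha ** n / sqrt_fact

def variance(op, psi):
    """<op^2> - <op>^2 in the state psi (assumed normalized)."""
    mean = np.vdot(psi, op @ psi)
    mean_sq = np.vdot(psi, op @ (op @ psi))
    return float(np.real(mean_sq - mean ** 2))

N = 40                      # cutoff; an assumption that is ample for |alpha| ~ 1
alpha = 1.0 + 0.5j
a = annihilation(N)
psi = glauber_state(alpha, N)

# Eigenvalue property: a|alpha> = alpha|alpha>, up to truncation error.
residual = np.linalg.norm(a @ psi - alpha * psi)

# Quadratures and their uncertainties (hbar = 1).
x = (a + a.conj().T) / np.sqrt(2)
p = (a - a.conj().T) / (1j * np.sqrt(2))
dx, dp = np.sqrt(variance(x, psi)), np.sqrt(variance(p, psi))

print(residual, dx * dp)    # residual is tiny; dx * dp is close to 1/2
```

The same truncation makes `a` nilpotent, which is the finite-dimensional obstruction mentioned above: the check only works because the state has negligible weight near the cutoff.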
Summary: pure states are coherent superpositions of basis states. Mixed states are incoherent superpositions of states. Glauber coherent states (or harmonic oscillator coherent states) are pure states but also satisfy additional properties as laid out by Glauber in terms of correlation functions. Generalized coherent states were introduced by Perelomov; they are pure states which share some properties of the Glauber coherent states.