I) The main point is that the half-angle $\frac{\theta}{2}$ doubles when we go from the ket $$\tag{1} |\psi\rangle~=~\begin{bmatrix}\cos\frac{\theta}{2} \cr e^{i\phi}\sin\frac{\theta}{2}\end{bmatrix},
\qquad ||\psi||~=~1, $$ to the density matrix/operator
$$\tag{2}\rho~=~| \psi\rangle \langle\psi | ~=~\frac{1}{2}\left({\bf 1}_{2\times 2}+ \vec{r}\cdot \vec{\sigma}\right),\qquad {\rm tr}\rho~=~1. $$
In eq. (2)
$$\tag{3}\vec{r}~=~\begin{bmatrix}x\cr y\cr z\end{bmatrix}~=~\begin{bmatrix}r\cos\phi\sin\theta\cr r\sin\phi\sin\theta\cr r\cos\theta\end{bmatrix},\qquad r~=~1, $$
is the radius vector in spherical coordinates, and $\sigma_i$ are the Pauli matrices. (We mention for completeness that the Bloch sphere $S^2=\partial B^3$ of pure qubit states is the boundary of the Bloch ball $B^3$ of mixed qubit states.)
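To make the doubling explicit, one can multiply out the outer product in eq. (2) for the ket (1) and apply the double-angle identities $\cos^2\frac{\theta}{2}-\sin^2\frac{\theta}{2}=\cos\theta$ and $2\sin\frac{\theta}{2}\cos\frac{\theta}{2}=\sin\theta$. This is only a worked check of eqs. (1)-(3), not an additional result:
$$|\psi\rangle\langle\psi|~=~\begin{bmatrix}\cos^2\frac{\theta}{2} & e^{-i\phi}\sin\frac{\theta}{2}\cos\frac{\theta}{2}\cr e^{i\phi}\sin\frac{\theta}{2}\cos\frac{\theta}{2} & \sin^2\frac{\theta}{2}\end{bmatrix}~=~\frac{1}{2}\begin{bmatrix}1+\cos\theta & e^{-i\phi}\sin\theta\cr e^{i\phi}\sin\theta & 1-\cos\theta\end{bmatrix}~=~\frac{1}{2}\begin{bmatrix}1+z & x-iy\cr x+iy & 1-z\end{bmatrix}, $$
with $(x,y,z)$ as in eq. (3) at $r=1$. The last matrix is exactly $\frac{1}{2}\left({\bf 1}_{2\times 2}+ \vec{r}\cdot \vec{\sigma}\right)$.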
II) Alternatively, for a detailed group theoretical explanation of the presence of the half-angle, consult e.g. Ref. 1. In short, the 2-dimensional Hilbert space $H\cong\mathbb{C}^2$ of the qubit is
a spinor/doublet representation of the $G=SU(2)$ Lie group, which is a double cover of the 3D rotation group $SO(3)$. The adjoint representation
$$\tag{4}{\rm Ad}:~ G ~\longrightarrow~GL(su(2),\mathbb{R}), $$
given by
$$\tag{5} {\rm Ad}(g)\sigma~=~g\sigma g^{-1}, \qquad g~\in~G, \qquad \sigma~\in~su(2)~\cong ~\mathbb{R}^3, $$
is a Lie group homomorphism, whose image
$$\tag{6} {\rm Ad}(G)~\cong ~SO(3), \qquad {\rm Ad}(\pm {\bf 1}_{2\times 2})~=~{\bf 1}_{3\times 3},$$
is isomorphic to $SO(3)$. The doubling of angles implicitly takes place in formula (5).
(This is similar to the fact that a spin-$\frac{1}{2}$ particle requires a $4\pi$ rotation (rather than $2\pi$) to return to its initial state.)
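As a concrete instance of formula (5), here is a worked example in which the angle $\alpha$ and the choice of the $z$-axis are introduced purely for illustration. Take $g=e^{-i\alpha\sigma_z/2}=\cos\frac{\alpha}{2}\,{\bf 1}_{2\times 2}-i\sin\frac{\alpha}{2}\,\sigma_z$. Since $\sigma_z$ anticommutes with $\sigma_x$, one has $g\sigma_x=\sigma_x g^{-1}$, and therefore
$$g\,\sigma_x\, g^{-1}~=~\sigma_x\, e^{i\alpha\sigma_z}~=~\cos\alpha~\sigma_x+\sin\alpha~\sigma_y. $$
The group element contains the half-angle $\frac{\alpha}{2}$, while the vector $\sigma_x$ is rotated about the $z$-axis by the full angle $\alpha$. In particular, $\alpha=2\pi$ gives $g=-{\bf 1}_{2\times 2}$ but ${\rm Ad}(g)={\bf 1}_{3\times 3}$, in agreement with eq. (6).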
References:
- G. 't Hooft, Introduction to Lie Groups in Physics, lecture notes, chapter 6.
The Bloch sphere is beautifully minimalist.
Conventionally, a qubit has four real parameters: $a e^{i\chi} |0\rangle + b e^{i\phi} |1\rangle.$ However, some quick insight reveals that the $a$-vs-$b$ tradeoff only has one degree of freedom due to the normalization $a^2 + b^2 = 1$, and some more careful insight reveals that, in the way we construct expectation values in QM, you cannot observe $\chi$ or $\phi$ themselves but only the difference $\chi - \phi$, which is $2\pi$-periodic. (This is covered further in the comments below but briefly: QM only predicts averages $\langle \psi|\hat A|\psi\rangle$, and shifting the overall phase of a wave function by some $|\psi\rangle\mapsto e^{i\theta}|\psi\rangle$ therefore cancels itself out in every prediction.)
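The cancellation of the overall phase can be checked numerically. The following is a minimal sketch in plain Python; the helper name `expectation` and the particular amplitudes and phases are arbitrary choices made for illustration, not anything fixed by the argument above.

```python
import cmath

def expectation(op, psi):
    """Return <psi|op|psi> for a 2x2 matrix `op` and a 2-component ket `psi`."""
    op_psi = [op[0][0] * psi[0] + op[0][1] * psi[1],
              op[1][0] * psi[0] + op[1][1] * psi[1]]
    value = psi[0].conjugate() * op_psi[0] + psi[1].conjugate() * op_psi[1]
    return value.real  # Hermitian operators have real expectation values

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]

# Illustrative state with a = 0.6, b = 0.8, chi = 0.2, phi = 1.1 (arbitrary values).
psi = [0.6 * cmath.exp(0.2j), 0.8 * cmath.exp(1.1j)]
# The same state multiplied by an overall phase e^{2.5 i}.
shifted = [cmath.exp(2.5j) * c for c in psi]

for op in (sigma_x, sigma_y):
    # The two numbers on each line agree up to rounding.
    print(expectation(op, psi), expectation(op, shifted))
```

Only the difference of the two phases (here $1.1 - 0.2 = 0.9$) shows up in the printed values.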
So if you think at the most abstract about what you need, you just draw a line from 0 to 1 representing the $a$-vs-$b$ tradeoff: how much is this in one of these two states? Then you draw circles around it: how much is the phase difference? What stops it from being a cylinder is that the phase difference ceases to matter when $a=1$ or $b=1$, hence the circles must shrink down to points. Et voila, you have something which is topologically equivalent to a sphere. The sphere contains all of the information you need for experiments, and nothing else.
It’s also physical, a real sphere in 3D space.
This is the more shocking fact. Given only the simple picture above, you could be forgiven for thinking that this was all harmless mathematics: no! In fact the quintessential qubit is a spin-$\frac 12$ system, with the Pauli matrices indicating the way that the system is spinning around the $x$, $y$, or $z$ axes. This is a system where we identify $|0\rangle$ with $|\uparrow\rangle$, $|1\rangle$ with $|\downarrow\rangle$, and the phase difference comes in by choosing the $+x$-axis via $|{+x}\rangle = \sqrt{\frac 12} |0\rangle + \sqrt{\frac 12} |1\rangle.$
The orthogonal directions of space are not Hilbert-orthogonal in the QM treatment, because that’s just not how the physics of this system works. Hilbert-orthogonal states are incommensurate: if you’re in this state, you’re definitely not in that one. But this system has a spin with a definite total magnitude of $\sqrt{\langle L^2 \rangle} = \sqrt{3/4} \hbar$, but only $\hbar/2$ of it points in the direction that it is “most pointed along,” meaning that it must be distributed on some sort of “ring” around that direction. Accordingly, when you measure that it’s in the $+z$-direction it turns out that it’s also sort-of half in the $+x$, half in the $-x$ direction. (Here “sort-of” means: it is, if you follow up with an $x$-measurement.)
So let’s ask “which direction is the spin-$\frac12$ most spinning in?” This requires constructing an observable. To give an example, if the $+z$-direction is most-spun-in by a state $|\uparrow\rangle$ then the observable for $z$-spin is the Pauli matrix $\sigma_z = |\uparrow\rangle\langle\uparrow| - |\downarrow\rangle\langle\downarrow|,$ $+1$ in that state, $-1$ in the Hilbert-perpendicular state $\langle \downarrow | \uparrow \rangle = 0.$ Similarly if you look at $\sigma_x = |\uparrow\rangle \langle \downarrow | + |\downarrow \rangle\langle \uparrow |$ you will see that the $|{+x}\rangle$ state defined above is an eigenvector with eigenvalue +1 and similarly there should be a $|{-x}\rangle \propto |\uparrow\rangle - |\downarrow\rangle$ satisfying $\langle {+x}|{-x}\rangle = 0,$ and you can recover $\sigma_x = |{+x}\rangle\langle{+x}| - |{-x}\rangle\langle{-x}|.$
Then the state orthogonal to $|\psi\rangle = \alpha |0\rangle + \beta |1\rangle$ is $|\bar \psi\rangle = \beta^*|0\rangle - \alpha^* |1\rangle,$ so the observable which is $+1$ in the state $|\psi\rangle$ and $-1$ in the orthogonal state $|\bar\psi\rangle$ is:$$
\begin{align}
|\psi\rangle\langle\psi| - |\bar\psi\rangle\langle\bar\psi| &= \begin{bmatrix}\alpha\\\beta\end{bmatrix}\begin{bmatrix}\alpha^*&\beta^*\end{bmatrix} - \begin{bmatrix}\beta^*\\-\alpha^*\end{bmatrix} \begin{bmatrix}\beta & -\alpha\end{bmatrix}\\
&=\begin{bmatrix}|\alpha|^2 - |\beta|^2 & 2 \alpha\beta^*\\
2\alpha^*\beta & |\beta|^2 - |\alpha|^2\end{bmatrix}
\end{align}$$Writing this as $v_i \sigma_i$ (summed over $i$), where the $\sigma_i$ are the Pauli matrices, we get:$$\begin{align} v_z &= |\alpha|^2 - |\beta|^2,\\
v_x + i v_y &= 2 \alpha^* \beta. \end{align}$$
Now letting $\alpha = \cos(\theta/2)$ and $\beta = \sin(\theta/2) e^{i\phi}$ we find out that these are:$$\begin{align} v_z &= \cos^2(\theta/2) - \sin^2(\theta/2) &=&~ \cos \theta,\\
v_x &= 2 \cos(\theta/2)\sin(\theta/2) ~\cos(\phi) &=&~ \sin \theta~\cos\phi, \\
v_y &= 2 \cos(\theta/2)\sin(\theta/2) ~\sin(\phi) &=&~ \sin \theta~\sin\phi.
\end{align}$$So the Bloch prescription uses a $(\theta, \phi)$ which are simply the spherical coordinates of the point on the sphere which such a $|\psi\rangle$ is “most spinning in the direction of.”
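The map from amplitudes to the direction $(v_x, v_y, v_z)$ is easy to check numerically. This is a small sketch in plain Python; the function name `bloch_from_amplitudes` and the test angles are illustrative assumptions. It also checks that the orthogonal state $|\bar\psi\rangle$ lands on the antipodal point.

```python
import cmath
import math

def bloch_from_amplitudes(alpha, beta):
    """Map the normalized state alpha|0> + beta|1> to (v_x, v_y, v_z)."""
    alpha, beta = complex(alpha), complex(beta)
    v_z = abs(alpha) ** 2 - abs(beta) ** 2
    w = 2 * alpha.conjugate() * beta   # this is v_x + i v_y
    return (w.real, w.imag, v_z)

theta, phi = 1.1, 0.7                  # arbitrary test angles
alpha = complex(math.cos(theta / 2))
beta = math.sin(theta / 2) * cmath.exp(1j * phi)

v = bloch_from_amplitudes(alpha, beta)
spherical = (math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta))

# Orthogonal state beta^*|0> - alpha^*|1>: expected to give -v.
v_bar = bloch_from_amplitudes(beta.conjugate(), -alpha.conjugate())

print(v)
print(spherical)
print(v_bar)
```

The first two printed tuples agree up to rounding, and the third is their negative, which is the statement that Hilbert-orthogonal states sit at opposite ends of a diameter.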
So instead of being a purely theoretical visualization, we can say that the spin-$\frac 12$ system, the prototypical qubit, actually spins in the direction given by the Bloch sphere coordinates! (At least, insofar as a spin-up system spins up.) It is ruthlessly physical: you want to wave it away into a mathematical corner and it says, “no, for real systems I’m pointed in this direction in real 3D space and you have to pay attention to me.”
How these answer your questions.
Yes, N and S are spatially antiparallel (they lie along the same axis) but in the Hilbert space they are orthogonal. This Hilbert-orthogonality means that a system cannot be both spin-up and spin-down. Conversely the lack of Hilbert-orthogonality between, say, the $z$ and $x$ directions means that when you measure the $z$-spin you can still have nonzero measurements of the spin in the $x$-direction, which is a key feature of such systems. It is indeed a little confusing to have two different notions of “orthogonal,” one for physical space and one for the Hilbert space, but it comes from having two different spaces that you’re looking at.
One way to see why the angles are physically very useful is given above. But as mentioned in the first section, you can also view it as a purely mathematical exercise of trying to describe the configuration space with a sphere: then you naturally have the phase difference, which is $2\pi$-periodic, so that is a naturally ‘azimuthal’ coordinate; therefore the way that the coordinate lies along 0/1 should be a ‘polar’ coordinate with $0$ mapping to $|0\rangle$ and $\pi$ mapping to $|1\rangle$. The obvious way to do this is with $\cos(\theta/2)$ mapping from 1 to 0 along this range, as the amplitude for the $|0\rangle$ state; the fact that $\cos^2 + \sin^2 = 1$ means that the $|1\rangle$ state must pick up a $\sin(\theta/2)$ amplitude to match it.
Best Answer
In general, if $\vert\chi_a\rangle$ and $\vert \chi_b\rangle$ are any two spin states, the density operator $$ \hat \rho=w_a \vert \chi_a\rangle\langle \chi_a\vert + w_b\vert\chi_b\rangle\langle \chi_b\vert\, ,\qquad w_a+w_b=1\, . $$ describes a statistical mixture where a fraction $w_a$ of the particles are in the state $\vert\chi_a\rangle$ and a fraction $w_b$ in the state $\vert\chi_b\rangle$.
Note that a Stern-Gerlach magnet oriented along $\langle \vec S_a\rangle =\left(\langle \sigma_{ax}\rangle,\langle\sigma_{ay}\rangle,\langle\sigma_{az}\rangle\right)$ would deflect particles in the state $\vert \chi_a\rangle$ "up", but would split the particles in state $\vert\chi_b\rangle$ into an up and a down beam, and conversely a Stern-Gerlach magnet oriented along $\langle \vec S_b\rangle$ would split the particles in state $\vert\chi_a\rangle$ but deflect particles in $\vert\chi_b\rangle$ only up. This is equivalent to stating that no orientation of a Stern-Gerlach magnet would result in an up-only deflection of all particles, or alternatively, that the system described by $\hat \rho$ is not an eigenstate of spin in any direction.
More generally, when $\vert\chi_a\rangle$ and $\vert\chi_b\rangle$ are not orthogonal one can expand them in the $\vert \pm\textstyle\frac{1}{2}\rangle$ basis \begin{align} \vert{\chi_a}\rangle &=a_+\vert{+\textstyle\frac{1}{2}}\rangle+a_-\vert{-\textstyle\frac{1}{2}}\rangle \\ \vert{\chi_b}\rangle &=b_+\vert{+\textstyle\frac{1}{2}}\rangle+b_-\vert{-\textstyle\frac{1}{2}}\rangle \end{align} so that \begin{align} \vert{\chi_a}\rangle\langle \chi_a\vert = \left( \begin{array}{cc} a_+a_+^* & a_+a_-^* \\ a_- a_+^* & a_- a_-^* \\ \end{array} \right), \end{align} and similarly for $\vert\chi_b\rangle \langle \chi_b\vert$.
The density operator becomes \begin{align} \hat \rho&=w_a\left( \begin{array}{cc} a_+ a_+^* & a_+a_-^* \\ a_-a_+^* & a_-a_-^* \\ \end{array} \right)+w_b\left( \begin{array}{cc} b_+b_+^* & b_+b_-^* \\ b_-b_+^* & b_-b_-^* \\ \end{array} \right) \\ &=\left( \begin{array}{cc} w_a\vert a_+\vert^2+w_b\vert b_+\vert ^2 & w_aa_+a_-^*+w_bb_+b_-^* \\ w_aa_-a_+^*+w_bb_-b_+^* & w_a\vert a_-\vert ^2+w_b\vert b_-\vert^2 \\ \end{array} \right) \end{align}
You can pick the coefficients $a_\pm$ and $b_\pm$ to your heart's content, making sure that $\vert a_+\vert^2+\vert a_-\vert^2=1$ (likewise for $b_\pm$) and compute using this $\hat\rho$ the vector $$ \langle \vec S\rangle = \left(\langle \sigma _x\rangle,\langle \sigma _y\rangle,\langle \sigma _z\rangle\right) $$ with $$ \langle \sigma_k\rangle =\hbox{Tr}\left(\hat \rho\sigma_k\right) $$ and you will find that, unless $w_a=1$ and $w_b=0$ or vice versa, $\rho^2\ne \rho$ and the length of $\langle \vec S\rangle$ is less than $1$, so your vector will lie inside the Bloch sphere. When $\rho^2= \rho$, the state is pure and your vector will be on the surface of the Bloch sphere.
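A short way to see the link between purity and the length of $\langle \vec S\rangle$, given here as a worked step rather than as part of the original argument: any $2\times 2$ density matrix can be written as $\hat\rho=\frac{1}{2}\left({\bf 1}+\langle \vec S\rangle\cdot\vec\sigma\right)$ with $\langle \sigma_k\rangle =\hbox{Tr}\left(\hat \rho\sigma_k\right)$, and the Pauli matrices satisfy $(\vec v\cdot\vec\sigma)^2=\vert\vec v\vert^2\,{\bf 1}$. Squaring then gives
$$ \hat\rho^2-\hat\rho=\frac{1}{4}\left(\vert\langle \vec S\rangle\vert^2-1\right){\bf 1}\, ,\qquad \hbox{Tr}\left(\hat\rho^2\right)=\frac{1}{2}\left(1+\vert\langle \vec S\rangle\vert^2\right)\, , $$
so $\hat\rho^2=\hat\rho$ holds exactly when $\vert\langle \vec S\rangle\vert=1$.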
As a simple example, choose $a_+=i, a_-=0$, $b_+=\frac{1}{2}$, $b_-=-\frac{\sqrt{3}}{2}$. Then $$ \rho= \left(\begin{array}{cc}\frac{1}{4}w_b+w_a&-\frac{\sqrt{3}}{4}w_b\\-\frac{\sqrt{3}}{4}w_b&\frac{3}{4}w_b \end{array}\right) $$ Note that the trace of this matrix is $w_a+w_b=1$. It's easy to figure out that \begin{align} \langle \vec S_a\rangle &= (0,0,1)\\ \langle \vec S_b\rangle &= \left(-\textstyle\frac{\sqrt{3}}{2},0,-\frac{1}{2}\right)\\ \langle \vec S\rangle &= \left(-\textstyle\frac{\sqrt{3}}{2}w_b,0,(w_a-\frac{w_b}{2})\right) \end{align} and that $\langle \vec S\rangle$ has length $\sqrt{w_a^2-w_aw_b+w_b^2}$ so does not lie on the surface of the Bloch sphere unless one of $w_a$ or $w_b$ is $1$ and the other $0$. In this way, the average $\langle S_{x,y,z}\rangle$ is the statistical average of each $\langle S_{a,xyz}\rangle $ and $\langle S_{b,xyz}\rangle $ with $w_a$ and $w_b$ entering as statistical weights.
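The numbers in this example can be reproduced with a few lines of plain Python. This is only a sketch: the function names `mixture_bloch_vector` and `length` and the sample weight $w_a=0.3$ are illustrative choices, not part of the example above.

```python
import math

def mixture_bloch_vector(w_a):
    """Bloch vector of rho = w_a |chi_a><chi_a| + w_b |chi_b><chi_b|
    for a_+ = i, a_- = 0, b_+ = 1/2, b_- = -sqrt(3)/2."""
    w_b = 1.0 - w_a
    chi_a = (1j, 0j)
    chi_b = (0.5 + 0j, -math.sqrt(3) / 2 + 0j)

    def projector(ket):
        # Outer product |ket><ket| as a 2x2 nested list.
        return [[ket[i] * ket[j].conjugate() for j in range(2)] for i in range(2)]

    p_a, p_b = projector(chi_a), projector(chi_b)
    rho = [[w_a * p_a[i][j] + w_b * p_b[i][j] for j in range(2)] for i in range(2)]

    # <sigma_k> = Tr(rho sigma_k), written out for the three Pauli matrices.
    s_x = (rho[0][1] + rho[1][0]).real
    s_y = (1j * (rho[0][1] - rho[1][0])).real
    s_z = (rho[0][0] - rho[1][1]).real
    return (s_x, s_y, s_z)

def length(v):
    return math.sqrt(sum(c * c for c in v))

w_a = 0.3                      # arbitrary sample weight, so w_b = 0.7
s = mixture_bloch_vector(w_a)
print(s, length(s))
print(math.sqrt(w_a**2 - w_a * (1 - w_a) + (1 - w_a)**2))
```

The two printed lengths agree up to rounding, and setting `w_a` to `0.0` or `1.0` gives length $1$, i.e. a point on the surface of the Bloch sphere.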