A covariance matrix has ${n \choose 2} + n = \frac{n(n+1)}{2}$ free elements. The constraints for the spectral decomposition are:
1. The eigenvalues are positive.
2. The eigenvectors are orthogonal.
3. The eigenvectors are unit length.
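As a quick numerical check of these three constraints (a minimal NumPy sketch; the matrix `Sigma` below is just an arbitrary positive-definite example), `np.linalg.eigh` returns positive eigenvalues and orthonormal eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)   # an arbitrary positive-definite covariance matrix

# eigh returns the eigenvalues and orthonormal eigenvectors of a symmetric matrix
eigenvalues, Psi = np.linalg.eigh(Sigma)

print(np.all(eigenvalues > 0))              # 1. eigenvalues are positive
print(np.allclose(Psi.T @ Psi, np.eye(n)))  # 2. & 3. columns are orthonormal
print(n * (n + 1) // 2)                     # n(n+1)/2 = 10 free elements for n = 4
```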
I recently asked a question related to this. @amoeba had a good comment that helped visualize these constraints and showed why the number of free elements in $\Psi$ is ${n \choose 2}$ and the number of free elements in $\Lambda^2$ is $n$.
Regarding the ordering of the eigenvalues, that may or may not be important to you. The spectral decomposition is only unique up to re-orderings of the diagonal of $\Lambda^2$. If your task is to estimate the parameters of this matrix, each of these re-orderings has its own corresponding set of estimates, so the model will not be identifiable.
However, if you are just theorizing about the covariance matrix, people sometimes just restrict their attention to one ordering of the eigenvalues, without losing any generality. Proving something is true for one ordering will usually guarantee truth for the other orderings.
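To see this non-identifiability concretely, here is a minimal sketch (NumPy again, with an arbitrary example matrix): permuting the eigenvalues along with the matching columns of $\Psi$ reconstructs exactly the same $\Sigma$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
Sigma = A @ A.T + 3 * np.eye(3)

eigenvalues, Psi = np.linalg.eigh(Sigma)

# Any permutation of the eigenpairs gives back the same covariance matrix
perm = [2, 0, 1]
Sigma_permuted = Psi[:, perm] @ np.diag(eigenvalues[perm]) @ Psi[:, perm].T

print(np.allclose(Sigma, Sigma_permuted))   # True: the decomposition is not unique
```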
Edit to address comment:
1, 2 and 3 are constraints that every covariance matrix has, so it is as "free" as possible. Sampling from some distribution over $\Sigma$ is possible as long as the distribution exists, but it is also common to restrict the columns of $\Psi$ further, which is the same as fixing the ordering of your eigenvalues. What I mean is that sometimes the restriction is $\psi_{i,j} = 0$ when $j > i$ and $\psi_{j,j} = 1$; in other words, $\Psi$ is lower triangular with $1$s on the diagonal. There are other ways to restrict this space, but it would look something like this:
$$
\Psi =
\left[\begin{array}{cccc}
1 & 0 & 0 & 0 \\
\psi_{2,1} & 1 & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\psi_{p,1} & \psi_{p,2} & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots \\
\psi_{n,1} & \cdots & \psi_{n,p-1} & \psi_{n,p}
\end{array} \right].
$$
Why is this representation unique? If you re-order the diagonal matrix $\Lambda^2$, then you have to re-order the columns of $\Psi$, but then the columns of $\Psi$ no longer follow this pattern, as the identity below spells out. Why do this? Because now your posterior is not multi-modal.
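To spell out the re-ordering step: for any permutation matrix $P$ (which satisfies $P P^T = I$),
$$
\Psi \Lambda^2 \Psi^T = \Psi P P^T \Lambda^2 P P^T \Psi^T = (\Psi P)\,(P^T \Lambda^2 P)\,(\Psi P)^T,
$$
so permuting the diagonal of $\Lambda^2$ (which is what $P^T \Lambda^2 P$ does) must be matched by permuting the columns of $\Psi$, and the permuted columns generally no longer satisfy $\psi_{j,j} = 1$ with zeros above the diagonal.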
Best Answer
$x^T \Sigma^{-1} x$ is a quadratic form, closely related to the notion of Mahalanobis distance. It is a kind of generalization of standard Euclidean distance to the case when not all directions are equal. This is useful when the variation along one direction is much bigger than along another, so that a small distance along the latter is quite significant.
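As a small illustration (a sketch assuming NumPy and SciPy; `Sigma` and the point `x` are made up), the quadratic form matches the squared Mahalanobis distance from `scipy.spatial.distance`:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

Sigma = np.array([[4.0, 1.0],
                  [1.0, 1.0]])   # more variance along the first axis
Sigma_inv = np.linalg.inv(Sigma)

x = np.array([1.0, 1.0])

# The quadratic form x^T Sigma^{-1} x ...
quad_form = x @ Sigma_inv @ x
# ... equals the squared Mahalanobis distance between x and the origin
d = mahalanobis(x, np.zeros(2), Sigma_inv)

print(np.isclose(quad_form, d ** 2))   # True
```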
One can decompose $\Sigma^{-1}$ into $Q^T \Lambda Q = Q^T \Lambda^{1/2} \Lambda^{1/2} Q = (\Lambda^{1/2} Q)^T (\Lambda^{1/2} Q)$ (where $\Lambda$ is the diagonal matrix of the inverted eigenvalues of $\Sigma$, and $Q$ is orthogonal), so, applied to vectors,
$$ x^T \Sigma^{-1} x = x^T (\Lambda^{1/2} Q)^T (\Lambda^{1/2} Q) x = (\Lambda^{1/2} Q x)^T (\Lambda^{1/2} Q x) = \|\Lambda^{1/2} Q x\|^2 $$
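A quick numerical check of this identity (a NumPy sketch; `Sigma` and `x` are arbitrary):

```python
import numpy as np

Sigma = np.array([[4.0, 1.0],
                  [1.0, 1.0]])
x = np.array([1.0, 2.0])

# Eigendecomposition Sigma = V diag(w) V^T, so Q = V^T and Lambda = diag(1/w)
w, V = np.linalg.eigh(Sigma)
Q = V.T
sqrt_Lambda = np.diag(w ** -0.5)   # Lambda^{1/2}: inverted eigenvalues under the root

lhs = x @ np.linalg.inv(Sigma) @ x
rhs = np.linalg.norm(sqrt_Lambda @ Q @ x) ** 2

print(np.isclose(lhs, rhs))   # True
```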
What this formula says is that this quadratic form is just the squared Euclidean norm in a new, morphed space: we apply the transformation $\Lambda^{1/2} Q$ to the vector $x$, and then take its norm.
What does this transformation do to our $x$? Well, it consists of two "sub-transformations": the first is multiplication by $Q$, which just rotates the vector (this is what orthogonal matrices do), preserving its norm; the second is multiplication by $\Lambda^{1/2}$, which stretches the vector along each axis in proportion to the square root of the corresponding diagonal element of $\Lambda$.
Applied to 2D data points, this transformation (for some $\Sigma$) rotates the point cloud and rescales it into a roughly spherical one. You can think of the Mahalanobis distance as a distance that implicitly maps your vectors into a new space, measures the norm there, and returns it back to you.
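Here is a sketch of that mapping (the correlated 2D sample and `Sigma` are made up for illustration): it whitens the point cloud, so plain Euclidean norms in the new space match Mahalanobis distances in the original one:

```python
import numpy as np

rng = np.random.default_rng(2)
Sigma = np.array([[4.0, 1.8],
                  [1.8, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=500)

# Build the whitening map Lambda^{1/2} Q from the eigendecomposition of Sigma
w, V = np.linalg.eigh(Sigma)
W = np.diag(w ** -0.5) @ V.T     # Lambda^{1/2} Q

Z = X @ W.T                      # map every point into the "morphed" space

# In the new space the cloud is (approximately) spherical: identity covariance
print(np.round(np.cov(Z.T), 2))
```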