Solved – Equal Covariance in Linear Discriminant Analysis

Tags: classification, covariance, covariance-matrix, logistic, machine learning

In an online course, we are working through some linear discriminant analysis and I've been given an example. I am having trouble with the language used by the professor as it seems I am misunderstanding the assumptions regarding covariance.

The exact language for the example is as follows:

"Your data is from two classes and they are both Gaussian distributed. They have the same covariance but different means, as you can see from this picture (below)."

[Figure: Gaussian PDF]

My question arises with respect to the covariance. Below is what I have written to the instructor:

Is there a way that two variables in a bivariate Gaussian could NOT
have the same covariance? My understanding is that they would
necessarily have the same covariance since Cov(X,Y) = Cov(Y,X).

Are we also assuming in this situation that each variable has similar
variance so that var(x) = var(y) as well (and as such, their
covariance matrices are exactly equal)? Or are we only assuming their
covariances to be equal, which must necessarily be true?

I generally would not post a reply from an instructor publicly but I am in great need of clarification. Here is the reply I received:

Here's a picture of a basic classification problem (below):

[Image not available]

Is there a way that two variables in a bivariate Gaussian could NOT have the same covariance?

I think maybe it's that you're not understanding what covariance we're
looking at?

For example here we can see that these 2 groups have a different
amount of "spread".

The bottom group is spread out more.

It should be clear visually that their covariances are not equal.

Are we also assuming in this situation that each variable has similar variance so that var(x) = var(y) as well

I think perhaps it's due to a misunderstanding of what covariance is.

In fact, the variances of each independent variable are just the
entries along the diagonal of the covariance matrix.

has similar variance so that var(x) = var(y) as well (and as such, their covariance matrices are exactly equal)?

You have the implication backwards.

A valid sentence is: "their covariances are equal, therefore, the
variance of each component is equal"

The reverse is not true.

I followed up with another question to the instructor but it seems I must be misunderstanding the meaning of the problem statement.

My understanding is that a covariance matrix is always symmetric. However, the entries along the DIAGONAL don't have to be equal, since the variances along the diagonal can all be unique. I can't tell if he's trying to say we are meant to assume that they have the exact same covariance matrix, which would imply that the VARIANCE of samples from each class would be equal.

The last statement in the reply also confuses me. If two variables have different variances but are independent, their covariance is zero in both orders, so different variances are perfectly compatible with equal covariances. And cov(x,y) = cov(y,x) always, so as I understand it, equal covariances do not imply anything about equal variances.

Note: I have studied some LDA from other sources as well and believe that the VARIANCE of the two classes does need to be assumed unique. When he refers to the classes having the "same covariance," am I to understand that they have the exact same covariance MATRIX and, as such, must have the exact same variances? If that is the case, can anyone illuminate what I'm misunderstanding about the definition or meaning of covariance?

Best Answer

Your confusion arises from the fact that there are two different populations on the same multidimensional space. To clarify, let's play with a concrete example.

We have two populations $\mathcal{A}$ (people in Argentina) and $\mathcal{B}$ (people in Brazil). Each is described using the two same features $X,Y$ ($X$ - height, $Y$ - weight).

Now, in general $Cov_{\mathcal{A}}\left(X,Y\right) \neq Cov_{\mathcal{B}}\left(X,Y\right)$. That is, the relationship between height and weight in Argentina might differ from the relationship in Brazil. This is the case the instructor tried to emphasize. The original problem statement, however, assumes the two are equal.

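You should note that the covariance matrix for Argentina in our case is the following symmetric positive semi-definite matrix (positive definite unless the distribution is degenerate):

$$ \Sigma_{\mathcal{A}} = \begin{pmatrix} Var_{\mathcal{A}}\left(X\right) & Cov_{\mathcal{A}}\left(X,Y\right) \\ Cov_{\mathcal{A}}\left(Y,X\right) & Var_{\mathcal{A}}\left(Y\right)\end{pmatrix} $$

The "same covariance" assumption is $\Sigma_{\mathcal{A}} = \Sigma_{\mathcal{B}}$: every entry matches across the two populations. It says nothing about whether $Var_{\mathcal{A}}\left(X\right)$ equals $Var_{\mathcal{A}}\left(Y\right)$ within one population.

Finally, it doesn't make much sense to talk about the covariance between population $\mathcal{A}$ and population $\mathcal{B}$. Covariance is defined only between random variables that share a joint distribution.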
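To make the distinction concrete, here is a minimal simulation sketch using NumPy. All numbers (means, covariance entries, sample size) are made up for illustration and do not come from the course example. Classes A and B are drawn with one shared covariance matrix but different means, which is the situation the problem statement describes; class C is drawn with a different covariance matrix, which is roughly the situation in the instructor's picture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters, chosen only for illustration.
# Var(X) = 4 and Var(Y) = 1 differ on purpose: a shared covariance
# matrix does not require equal variances within a class.
shared_cov = np.array([[4.0, 1.5],
                       [1.5, 1.0]])
other_cov = np.array([[1.0, -0.5],
                      [-0.5, 2.0]])

mean_a = np.array([0.0, 0.0])
mean_b = np.array([3.0, 2.0])
mean_c = np.array([-3.0, 2.0])

n = 50_000
class_a = rng.multivariate_normal(mean_a, shared_cov, size=n)
class_b = rng.multivariate_normal(mean_b, shared_cov, size=n)
class_c = rng.multivariate_normal(mean_c, other_cov, size=n)

# Per-class 2x2 sample covariance matrices (rows are observations).
cov_a = np.cov(class_a, rowvar=False)
cov_b = np.cov(class_b, rowvar=False)
cov_c = np.cov(class_c, rowvar=False)

print("Sigma_A estimate:\n", cov_a)
print("Sigma_B estimate:\n", cov_b)
print("Sigma_C estimate:\n", cov_c)
```

Each estimated matrix is symmetric, because Cov(X,Y) = Cov(Y,X) within any single class. The estimates for A and B should agree up to sampling noise, while their diagonal entries remain far apart from each other. The estimate for C should differ from both, which is the kind of inequality the instructor was pointing at.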
