Discriminant Analysis – Deriving Total (Within Class + Between Class) Scatter Matrix

discriminant analysis

I was experimenting with PCA and LDA and got stuck at one point; I have a feeling it is so simple that I can't see it.

Within-class ($S_W$) and between-class ($S_B$) scatter matrices are defined as:

$$S_W = \sum_{i=1}^C\sum_{t=1}^N(x_t^i - \mu_i)(x_t^i - \mu_i)^T$$

$$S_B = \sum_{i=1}^C N(\mu_i-\mu)(\mu_i-\mu)^T$$

Total scatter matrix $S_T$ is given as:

$$S_T = \sum_{i=1}^C\sum_{t=1}^N(x_t^i - \mu)(x_t^i - \mu)^T = S_W + S_B$$

where $C$ is the number of classes, $N$ is the number of samples per class, $x_t^i$ is the $t$-th sample of class $i$, $\mu_i$ is the mean of class $i$, and $\mu$ is the overall mean.
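For a concrete reference, here is a minimal NumPy sketch of these three definitions, assuming equal class sizes $N$ as in the formulas. The data are random, and the names `X`, `mu_i`, `mu`, and the sizes `C`, `N`, `d` are illustrative choices, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
C, N, d = 3, 50, 4  # classes, samples per class, feature dimension (illustrative)

# X[i, t] is sample t of class i; each class gets its own random offset.
X = rng.normal(size=(C, N, d)) + rng.normal(scale=3.0, size=(C, 1, d))

mu_i = X.mean(axis=1)                 # class means, shape (C, d)
mu = X.reshape(-1, d).mean(axis=0)    # overall mean, shape (d,)

# Within-class scatter: sum over i, t of (x - mu_i)(x - mu_i)^T
Dw = X - mu_i[:, None, :]
S_W = np.einsum("itj,itk->jk", Dw, Dw)

# Between-class scatter: sum over i of N (mu_i - mu)(mu_i - mu)^T
Db = mu_i - mu
S_B = N * Db.T @ Db

# Total scatter: sum over i, t of (x - mu)(x - mu)^T
Dt = X.reshape(-1, d) - mu
S_T = Dt.T @ Dt

print(np.allclose(S_T, S_W + S_B))
```

The final line compares $S_T$ with $S_W + S_B$ up to floating-point tolerance, so the identity can be checked on data before working through the algebra.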

While trying to derive $S_T$, I reached a point where I had:

$$(x-\mu_i)(\mu_i-\mu)^T + (\mu_i-\mu)(x-\mu_i)^T$$

as a term. This needs to be zero, but why?


$$\begin{align}
S_T &= \sum_{i=1}^C\sum_{t=1}^N(x_t^i - \mu)(x_t^i - \mu)^T \\
&= \sum_{i=1}^C\sum_{t=1}^N(x_t^i - \mu_i + \mu_i - \mu)(x_t^i - \mu_i + \mu_i - \mu)^T \\
&= S_W + S_B + \sum_{i=1}^C\sum_{t=1}^N\big[(x_t^i - \mu_i)(\mu_i - \mu)^T + (\mu_i - \mu)(x_t^i - \mu_i)^T\big]
\end{align}$$

Best Answer

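If you assume

$$\mu_i = \frac{1}{N}\sum_{t=1}^N x_t^i,$$

then

$$\sum_{i=1}^C\sum_{t=1}^N(x_t^i-\mu_i)(\mu_i-\mu)^T = \sum_{i=1}^C\left(\sum_{t=1}^N(x_t^i-\mu_i)\right)(\mu_i-\mu)^T = \sum_{i=1}^C(N\mu_i-N\mu_i)(\mu_i-\mu)^T = 0$$

and the formula holds. You deal with the second term in the same way.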
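As a numerical sanity check, the sketch below evaluates the summed cross term twice: once with the actual class sample means as centers, and once with deliberately shifted centers. It assumes equal class sizes and random data; `cross_term` is a hypothetical helper written for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
C, N, d = 3, 40, 2  # classes, samples per class, feature dimension (illustrative)
X = rng.normal(size=(C, N, d)) + 5.0 * rng.normal(size=(C, 1, d))
mu = X.reshape(-1, d).mean(axis=0)  # overall mean


def cross_term(X, centers, mu):
    """Sum over i, t of (x - c_i)(c_i - mu)^T plus its transpose."""
    # dev[i] = sum over t of (x_t^i - c_i); zero when c_i is the class sample mean
    dev = (X - centers[:, None, :]).sum(axis=1)
    A = dev.T @ (centers - mu)  # sum over i of dev_i (c_i - mu)^T
    return A + A.T


mu_i = X.mean(axis=1)                        # true class sample means
zero_cross = cross_term(X, mu_i, mu)         # vanishes up to rounding error
shifted = cross_term(X, mu_i + 0.5, mu)      # centers are no longer the class means

print(np.abs(zero_cross).max(), np.abs(shifted).max())
```

The first value should be at rounding-error level and the second clearly nonzero, which shows that the cancellation depends entirely on $\mu_i$ being the sample mean of class $i$.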