Machine Learning – How Linear Discriminant Analysis Reduces Dimensions

Tags: discriminant-analysis, machine-learning

The following passage is from "The Elements of Statistical Learning", page 91:

The K centroids in p-dimensional input space span at most a K-1 dimensional subspace, and if p
is much larger than K, this will be a considerable drop in dimension.

I have two questions:

  1. Why do the K centroids in p-dimensional input space span at most a K-1 dimensional subspace?
  2. How are the K centroids located?

There is no explanation in the book, and I couldn't find the answer in related papers.

Best Answer

Discriminants are the axes and latent variables that differentiate the classes most strongly. The number of possible discriminants is $\min(k-1, p)$. The $k-1$ bound comes from the centroids themselves: any $k$ points lie in an affine subspace of dimension at most $k-1$ (two points define a line, three points a plane, and so on), so the between-class variation cannot occupy more than $k-1$ dimensions. For example, with $k=3$ classes in $p=2$ dimensional space there can exist at most 2 discriminants, as on the graph below. (Note that the discriminants are not necessarily orthogonal as axes drawn in the original space, although, as variables, they are uncorrelated.) The centroids of the classes are located within the discriminant subspace according to their perpendicular projections onto the discriminants.

[Figure: three classes in two-dimensional space with the two discriminant axes drawn over them]
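
To make the $\min(k-1, p)$ count and the centroid placement concrete, here is a minimal sketch using scikit-learn; the synthetic blob data and all parameter values are purely illustrative. Even though the points live in p = 10 dimensions, with k = 3 classes the fitted LDA yields only min(k-1, p) = 2 discriminants, and the class centroids can be placed inside that 2-dimensional subspace by projecting them onto the discriminant axes.

    import numpy as np
    from sklearn.datasets import make_blobs
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # Illustrative data: k = 3 classes living in p = 10 dimensions.
    X, y = make_blobs(n_samples=300, centers=3, n_features=10, random_state=0)

    lda = LinearDiscriminantAnalysis()
    Z = lda.fit_transform(X, y)

    # At most min(k-1, p) = 2 discriminants, regardless of p = 10.
    print(Z.shape)                         # (300, 2)

    # Class centroids projected into the discriminant subspace:
    centroids = lda.transform(lda.means_)  # shape (3, 2)
    print(centroids)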

The algebra of LDA at the extraction phase is described here.
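
The linked algebra is not reproduced in this answer, but a rough sketch of the standard extraction step (within-class scatter $S_W$, between-class scatter $S_B$, eigenvectors of $S_W^{-1} S_B$) might look like the code below; the function name and structure are my own illustration. Because $S_B$ is built from the k centroid deviations, its rank is at most k-1, so at most min(k-1, p) eigenvalues are non-zero and only that many discriminants exist.

    import numpy as np

    def lda_extract(X, y):
        """Sketch: discriminant axes as eigenvectors of S_W^{-1} S_B."""
        classes = np.unique(y)
        overall_mean = X.mean(axis=0)
        p = X.shape[1]

        S_W = np.zeros((p, p))  # within-class scatter
        S_B = np.zeros((p, p))  # between-class scatter
        for c in classes:
            Xc = X[y == c]
            mc = Xc.mean(axis=0)
            S_W += (Xc - mc).T @ (Xc - mc)
            d = (mc - overall_mean).reshape(-1, 1)
            S_B += Xc.shape[0] * (d @ d.T)

        # rank(S_B) <= k - 1, so at most min(k-1, p) non-zero eigenvalues
        # (assumes S_W is invertible, i.e. enough samples per class).
        eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
        order = np.argsort(eigvals.real)[::-1]
        return eigvals.real[order], eigvecs.real[:, order]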