PCA – Plotting a Discriminant Line on a Scatterplot in Discriminant Analysis

discriminant analysis, pca, scatterplot

Given a scatterplot of data, I can plot the data's principal components on it as axes tiled with points, where the points are the principal component scores. Below is an example plot with a cloud (consisting of 2 clusters) and its first principal component. It is drawn easily: raw component scores are computed as data matrix × eigenvector(s); the coordinate of each score point on an original axis (V1 or V2) is score × cosine between that axis and the component (which is the corresponding element of the eigenvector).

1st principal component tiled by its scores
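The projection described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `pca_tile_points` and the random example data are hypothetical, and the data are centred before projecting so that the tiled line passes through the cloud's centre (the question does not say whether its data were centred).

```python
import numpy as np

def pca_tile_points(X):
    """Return first-PC scores and the 2-D positions of those scores on the PC line."""
    X = np.asarray(X, dtype=float)
    centre = X.mean(axis=0)
    Xc = X - centre                      # assumption: work with centred data
    # eigh returns eigenvalues in ascending order, so the last column
    # is the unit-length eigenvector of the largest eigenvalue
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    v = eigvecs[:, -1]
    scores = Xc @ v                      # raw component scores
    # coordinate on each original axis = score * direction cosine (element of v)
    points = centre + np.outer(scores, v)
    return scores, points

# Hypothetical example data: one correlated 2-variable cloud
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])
scores, points = pca_tile_points(X)
```

Plotting `points` over the original scatter gives the component as a line tiled with its scores. The sign of the eigenvector is arbitrary, so the scores may come out mirrored.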

My question: is it possible to draw a discriminant in a similar fashion? Please look at my picture. I would now like to plot the discriminant between the two clusters as a line tiled with discriminant scores (from discriminant analysis) as points. If so, what would the algorithm be?

Best Answer

OK, since nobody answered, I think that, after some experimentation, I can do it myself. Following discriminant analysis guidelines, let T be the SSCP matrix of the whole cloud (data X, of 2 variables), computed from deviations from the cloud's centre, and let W be the pooled within-cluster SSCP matrix, computed from deviations from each cluster's own centre. Then B = T - W is the between-cluster SSCP matrix. Singular value decomposition of inv(W)B yields U (left singular vectors), S (diagonal matrix of singular values), and V (right singular vectors). In my example of 2 clusters, only the 1st value on the diagonal of S is nonzero (which means that there is only one discriminant), so we use only the 1st column of U: U(1). Because inv(W)B has rank 1 here, U(1) coincides, up to sign, with the unit-length eigenvector of inv(W)B, that is, with the discriminant direction. Now, XU(1) are the sought-for raw discriminant scores. To show the discriminant as a line tiled with those scores, multiply the scores by the cosine between each axis and the discriminant (which is the corresponding element of U(1)), just as I did with the principal component above. The resulting plot is below.

Discriminant tiled by its scores
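The steps in this answer can be sketched in NumPy as follows. Treat it as a minimal illustration, not a definitive implementation: the function name `discriminant_tile_points` and the simulated two-cluster data are hypothetical, the data are centred before computing scores (an assumption, made so the line passes through the cloud's centre), and taking the first left singular vector as the discriminant direction is justified here only for the two-cluster case.

```python
import numpy as np

def discriminant_tile_points(X, labels):
    """Return discriminant scores, their 2-D positions on the line, and the direction."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    centre = X.mean(axis=0)
    Xc = X - centre
    T = Xc.T @ Xc                          # total SSCP (deviations from cloud centre)
    W = np.zeros_like(T)
    for g in np.unique(labels):
        D = X[labels == g] - X[labels == g].mean(axis=0)
        W += D.T @ D                       # pooled within-cluster SSCP
    B = T - W                              # between-cluster SSCP
    # SVD of inv(W) B; solve(W, B) avoids forming the inverse explicitly
    U, S, Vt = np.linalg.svd(np.linalg.solve(W, B))
    u = U[:, 0]                            # unit-length direction, sign arbitrary
    scores = Xc @ u                        # raw discriminant scores (centred data)
    # coordinate on each original axis = score * direction cosine (element of u)
    points = centre + np.outer(scores, u)
    return scores, points, u

# Hypothetical example data: two clusters sharing the same covariance structure
rng = np.random.default_rng(1)
root = np.array([[1.5, 0.0], [1.0, 0.4]])
cluster_a = rng.normal(size=(40, 2)) @ root
cluster_b = rng.normal(size=(40, 2)) @ root + np.array([1.0, -2.0])
X = np.vstack([cluster_a, cluster_b])
labels = np.repeat([0, 1], 40)
scores, points, u = discriminant_tile_points(X, labels)
```

Overlaying `points` on the scatter of `X` draws the discriminant as a tiled line. Unlike principal components, discriminant directions are generally not orthogonal to each other in the original variable space, so with more than two clusters it would probably be safer to take eigenvectors of inv(W)B and normalize each to unit length before using its elements as cosines.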