Solved – Cluster Analysis followed by Discriminant Analysis

clusteringdiscriminant analysis

What is the rationale, if any, to use Discriminant Analysis (DA) on the results of a clustering algorithm like k-means, as I see it from time to time in the literature (essentially on clinical subtyping of mental disorders)?

It is generally not recommended to test for group differences on the variables that were used during cluster construction since they support the maximisation (resp. minimisation) of between-class (resp. within-class) inertia. So, I am not sure to fully appreciate the added value of predictive DA, unless we seek to embed individuals in a factorial space of lower dimension and get an idea of the "generalizability" of such a partition. But even in this case, cluster analysis remains fundamentally an exploratory tool, so using class membership computed this way to further derive a scoring rule seems strange at first sight.

Any recommendations, ideas or pointers to relevant papers?

Best Answer

I don't know of any papers on this. I've used this approach, for descriptive purposes. DFA provides a nice way to summarize group differences and dimensionality with respect to the original variables. One might more easily just profile the groups on the original variables, however, this loses the inherently multivariate nature of the clustering problem. DFA allows you to describe the groups while keeping the multivariate character of the problem intact. So, it can assist with the interpretation of the clusters, where that is a goal. This is particularly ideal when there is a close relationship between your clustering method and your classification method--e.g., DFA and Ward's method.

You are right about the testing problem. I published a paper using the Cluster Analysis with DFA follow-up to describe the clustering solution. I presented the DFA results with no test statistics. A reviewer took issue with that. I conceded and put the test statistics and p values in there, with the disclaimer that these p-values should not be interpreted in the traditional manner.

Related Question