Solved – Choosing variables for Discriminant Analysis

discriminant analysis, feature selection

I have 110 variables and 200 data points. Of these 110 variables, one is the group variable (say "brown eye", "blue eye"). I want to use discriminant analysis to classify the groups based on the remaining 109 variables. Since the number of variables is large relative to the sample size, I need to reduce it to get a meaningful result. I see 3 options:

1) Stepwise Discriminant Analysis: I don't want to use this method as I'm biased against it.

2) Classification Tree Method: This method will give an idea of which variables affect the eye color. Since the dataset is small, I'm apprehensive about using this method.

3) Principal Component Method: I could use this method, but I would prefer to keep the original variables.

My question: can anybody suggest some other method to select variables for discriminant analysis?

Best Answer

You can get rid of some variables by looking for pairs that are very highly correlated and randomly deleting one variable from each pair.
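A minimal sketch of that correlation filter, assuming NumPy and a numeric matrix `X` with one column per variable. The function name `drop_highly_correlated`, the 0.9 cutoff, and the synthetic data are illustrative assumptions, not something from the original problem:

```python
import numpy as np

def drop_highly_correlated(X, threshold=0.9, seed=0):
    """Return the column indices kept after pruning highly correlated pairs.

    For every pair of still-kept columns whose absolute Pearson correlation
    exceeds `threshold`, one member of the pair is dropped at random.
    Constant columns give NaN correlations and are never flagged here.
    """
    rng = np.random.default_rng(seed)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    n_features = corr.shape[0]
    keep = np.ones(n_features, dtype=bool)
    for i in range(n_features):
        for j in range(i + 1, n_features):
            if keep[i] and keep[j] and corr[i, j] > threshold:
                drop = i if rng.random() < 0.5 else j
                keep[drop] = False
    return np.flatnonzero(keep)

# Synthetic demo: column 3 is a near-copy of column 0.
rng = np.random.default_rng(1)
base = rng.normal(size=(200, 3))
X_demo = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=200)])
kept = drop_highly_correlated(X_demo)
print(kept)  # three indices: 1, 2, and one of 0 or 3
```

The cutoff is a judgment call; with 109 predictors it may be worth checking how many pairs are flagged at a few different thresholds before deleting anything.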

Then you can look at partial least squares (PLS), and pick the variables that are important in the PLS solution.

I did this with a similar problem and it worked pretty well (that is, the resulting discriminant function classified pretty well).
