Solved – Choosing the number of principal components to retain before training a neural network for classification

classification · neural networks · pattern recognition · pca

I am working on neural networks, currently building a perceptron that will serve as a classifier for a data set of face images. I am required to apply PCA (principal component analysis) to the data set before splitting the samples into training and testing sets. This lowers the dimensionality of the data and at the same time compresses the images.

However, I am not a statistician, and without a specific formula I am having trouble deciding how many principal components to keep. My data set is a 4096×400 array: 400 sample images, each of dimension 4096.
Is there a more precise, principled way to choose the number of principal components to use in PCA?

I am working in MATLAB, so I am using princomp.

Best Answer

The standard procedure is to compute the PCA decomposition and keep the components that account for, say, 95% of the total variance. In other words, compute the eigenvalues of the covariance matrix, sort them in decreasing order, and keep the leading components whose eigenvalues sum to 95% of the total.
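This threshold rule takes only a few lines; here is a sketch in NumPy (in MATLAB, the third output of `princomp`, `latent`, already contains these eigenvalues in decreasing order, so the same cumulative-sum trick applies there):

```python
import numpy as np

def n_components_for_variance(X, threshold=0.95):
    """Return how many principal components are needed to account for
    `threshold` of the total variance.  X is (n_samples, n_features)."""
    # Center the data: PCA operates on mean-centered variables.
    Xc = X - X.mean(axis=0)
    # Singular values of the centered data give the covariance eigenvalues
    # via eigenvalue_i = s_i**2 / (n_samples - 1), already sorted decreasingly.
    s = np.linalg.svd(Xc, compute_uv=False)
    eigvals = s**2 / (X.shape[0] - 1)
    # Cumulative fraction of total variance explained by the first k components.
    explained = np.cumsum(eigvals) / eigvals.sum()
    # Smallest k whose cumulative fraction reaches the threshold.
    return int(np.searchsorted(explained, threshold) + 1)

# Illustration with random data of the same shape as in the question:
# 400 sample images, each of dimension 4096.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 4096))
k = n_components_for_variance(X, 0.95)
```

With real face images the spectrum decays much faster than for random noise, so the retained `k` will typically be far smaller than the number of samples.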

If you would like a more principled approach, you may consider probabilistic PCA, which has some additional advantages: it can handle missing values and is more robust when working with high-dimensional data and relatively small sample sizes.

An extension of this idea (Bayesian PCA) allows you to select the number of components from the data automatically.
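If you ever work outside MATLAB, one off-the-shelf automatic selector in this spirit is Minka's Bayesian model-selection criterion, available in scikit-learn's `PCA` via `n_components='mle'`. A minimal sketch on synthetic data (note that this criterion requires at least as many samples as features, so a 400×4096 data set would need its feature count reduced first):

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: 300 samples in 10 dimensions with an (approximately)
# 3-dimensional signal subspace plus small isotropic noise.
rng = np.random.default_rng(42)
signal = rng.standard_normal((300, 3)) @ rng.standard_normal((3, 10)) * 3.0
X = signal + 0.1 * rng.standard_normal((300, 10))

# n_components='mle' applies Minka's automatic dimensionality selection,
# so the number of retained components is chosen from the data itself.
pca = PCA(n_components='mle').fit(X)
print(pca.n_components_)
```

The fitted `pca.n_components_` is the automatically selected dimensionality, with no variance threshold to tune by hand.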