# Solved – Using the eigenvalues from PCA in k-nearest-neighbours

Tags: classification, k-nearest-neighbour, pca, weighted-mean

I'm quite new to this StackExchange; I've only been a lurker until now, but my StackOverflow fellows have said you'd be the best people to ask about this.

Anyway, enough introduction. I'm using the weighted k-Nearest-Neighbours algorithm. My original data set has 37 features. I've looked into using PCA to reduce dimensionality, and I'm going to follow this method.

For simplicity's sake, let us assume that two of the new features created account for 90% of the variance and that I'm only going to use these two new features. Let us call them feature 1 and feature 2 ($f_1$, $f_2$). Let us say that $f_1$ accounts for 60% of the variance and $f_2$ accounts for 30% of the variance. I now wish to select the weights ($w_1, w_2$) for these two features. My initial intuition is to tie each feature's weight to the variance it accounts for. Therefore, I would use a weight combination of $w_1 = 0.6$ and $w_2 = 0.3$ in my k-Nearest-Neighbours algorithm.
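That intuition can be sketched as follows. This is a minimal, hypothetical example using scikit-learn on synthetic data (the 37-feature set and the exact 60%/30% split are stand-ins): the explained-variance ratios are used directly as weights, and scaling each component by $\sqrt{w_i}$ makes a plain Euclidean k-NN equivalent to the weighted metric.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the original 37-feature data set
X, y = make_classification(n_samples=200, n_features=37, random_state=0)

# Project onto the first two principal components
pca = PCA(n_components=2).fit(X)
Z = pca.transform(X)

# Weight each component by the fraction of variance it explains
# (the w1 = 0.6, w2 = 0.3 idea from the question)
w = pca.explained_variance_ratio_

# Scaling each column by sqrt(w_i) makes ordinary Euclidean distance
# equal to the weighted metric sqrt(sum_i w_i * (a_i - b_i)^2)
Z_weighted = Z * np.sqrt(w)

knn = KNeighborsClassifier(n_neighbors=5).fit(Z_weighted, y)
print(knn.score(Z_weighted, y))
```

Note the scaling trick: rather than passing a custom metric to the classifier, the weights are folded into the features themselves, which keeps the fast default distance implementation.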

I am well aware that there is much literature suggesting that the best way to select weights is a lattice-type method, where we try different combinations of weights and then go with the combination that yields the best results. I was just wondering whether the intuition of relating the weights to the total variance accounted for is valid. Also, as my dataset actually requires 11 features to account for 90% of the variance, I'd like to have a starting point for determining combinations of weights.
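The lattice-type search mentioned above might look like the following sketch (again on hypothetical synthetic data, with an assumed small grid of candidate weights per component): each weight combination is scored by cross-validation, and the variance-based weights can serve as one seed point in that grid.

```python
import itertools
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=37, random_state=0)
Z = PCA(n_components=2).fit_transform(X)

# Candidate weights per component; the variance-based values
# from the question would be natural entries in this grid
grid = [0.25, 0.5, 0.75, 1.0]

best_score, best_w = -np.inf, None
for w in itertools.product(grid, repeat=Z.shape[1]):
    # Fold the weights into the features, then cross-validate k-NN
    scores = cross_val_score(
        KNeighborsClassifier(n_neighbors=5), Z * np.sqrt(w), y, cv=5
    )
    if scores.mean() > best_score:
        best_score, best_w = scores.mean(), w

print(best_w, best_score)
```

With 11 components the grid grows exponentially ($4^{11}$ combinations here), which is exactly why a principled starting point such as the explained-variance weights is attractive.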

Summary: When using PCA as a precursor to k-NN, is it reasonable to base the weights of the features in k-NN on the total variance those features account for in the data?

Sorry if there are any formatting errors or if I'm breaking any protocols. Let me know if I have, and I will update the post.

The main idea is that you intend to use a weighted Euclidean metric: $$D(x_1,x_2)=\sqrt{(x_1-x_2)^T C (x_1-x_2)}$$ $$C=\operatorname{diag}(w_1,\dots,w_n)$$ The Mahalanobis distance is defined similarly, except that it replaces the diagonal weight matrix with the inverse of the covariance matrix $S$, so it also takes into account the correlations between the variables (some of your features may be correlated): $$D_M(x_1,x_2)=\sqrt{(x_1-x_2)^T S^{-1}(x_1-x_2)}$$
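Both metrics above can be computed directly with NumPy. A small sketch, using random sample data (the two-dimensional data and the weights $w_1=0.6$, $w_2=0.3$ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))   # sample data used to estimate the covariance
x1, x2 = X[0], X[1]

# Weighted Euclidean metric: C = diag(w_1, ..., w_n)
w = np.array([0.6, 0.3])
C = np.diag(w)
d_weighted = np.sqrt((x1 - x2) @ C @ (x1 - x2))

# Mahalanobis distance: C is replaced by the inverse sample covariance S^{-1}
S_inv = np.linalg.inv(np.cov(X, rowvar=False))
d_mahalanobis = np.sqrt((x1 - x2) @ S_inv @ (x1 - x2))

print(d_weighted, d_mahalanobis)
```

Note that after PCA the components of the transformed data are uncorrelated, so the two metrics differ only in how the diagonal entries are chosen.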