Is there an advantage to using higher dimensions (2D, 3D, etc.), or should you just build x-1 single-dimension classifiers and aggregate their predictions in some way?
This depends on whether your features are informative. Do you suspect that some features will not be useful in your classification task? To get a better idea of your data, you can also compute the pairwise correlation or mutual information between the response variable and each of your features.
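For instance, here is a minimal R sketch that ranks the iris measurements with a crude correlation-based score (coding the class as an integer just to get a rough ordering; estimating proper mutual information would require discretizing the features first):

```r
## Crude informativeness ranking: absolute correlation of each feature
## with an integer coding of the class. A rough screen, not a substitute
## for a proper mutual information estimate.
y <- as.integer(iris$Species)
sort(sapply(iris[, 1:4], function(f) abs(cor(f, y))), decreasing = TRUE)
```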
To combine all (or a subset) of your features, you can try computing the L1 (Manhattan) or L2 (Euclidean) distance between the query point and each 'training' point as a starting point.
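A minimal sketch of both distances, using the iris measurements as training data and (arbitrarily) their first row as the query point:

```r
## L1 (Manhattan) and L2 (Euclidean) distances from one query point
## to every training point.
train <- as.matrix(iris[, 1:4])
query <- train[1, ]                              # arbitrary query for illustration
d_l1  <- rowSums(abs(sweep(train, 2, query)))    # Manhattan
d_l2  <- sqrt(rowSums(sweep(train, 2, query)^2)) # Euclidean
head(order(d_l2))                                # indices of the nearest neighbours
```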
Building all of these classifiers from all potential combinations of the variables would be computationally expensive. How could I optimize this search to find the best kNN classifiers from that set?
This is the problem of feature subset selection. There is a lot of academic work in this area (see Guyon, I., & Elisseeff, A. (2003), An Introduction to Variable and Feature Selection, Journal of Machine Learning Research, 3, 1157-1182, for a good overview).
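One common heuristic from that literature is greedy forward selection: add one feature at a time and keep it only if cross-validated accuracy improves. A sketch on iris, scored with leave-one-out kNN from the class package (the choice of k = 5 is arbitrary):

```r
## Greedy forward selection scored by leave-one-out kNN accuracy.
## Far cheaper than the 2^p - 1 exhaustive search, but only a heuristic.
library(class)
X <- iris[, 1:4]
y <- iris$Species
selected  <- character(0)
remaining <- names(X)
best_acc  <- 0
while (length(remaining) > 0) {
  accs <- sapply(remaining, function(f)
    mean(knn.cv(X[, c(selected, f), drop = FALSE], y, k = 5) == y))
  if (max(accs) <= best_acc) break               # no further improvement
  best_acc  <- max(accs)
  selected  <- c(selected, names(which.max(accs)))
  remaining <- setdiff(remaining, selected)
}
selected
best_acc
```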
And, once I find a series of classifiers, what's the best way to combine their outputs into a single prediction?
This will depend on whether the selected features are independent. If the features are independent, you can weight each feature by its mutual information (or some other measure of informativeness) with the response variable (whatever you are classifying on). If some features are dependent, then a single classification model that uses all of them will probably work best.
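As an illustration of the independent case, here is a sketch that lets each single-feature kNN classifier vote, weighting each vote by a crude correlation score standing in for mutual information:

```r
## Weighted vote over single-feature kNN classifiers; each feature's vote
## counts in proportion to its informativeness score.
library(class)
X <- iris[, 1:4]
y <- iris$Species
w <- sapply(X, function(f) abs(cor(f, as.integer(y))))   # crude weights
votes <- sapply(names(X), function(f)
  as.character(knn.cv(X[, f, drop = FALSE], y, k = 5)))
combined <- apply(votes, 1, function(v)
  names(which.max(tapply(w, v, sum))))                   # weighted majority
mean(combined == y)                                      # combined accuracy
```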
How do most implementations apply kNN to more general learning problems?
By allowing the user to specify their own distance metric (or a precomputed distance matrix) over the set of points. kNN works well when an appropriate distance metric is used.
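For example, a sketch of kNN computed directly from an arbitrary precomputed distance matrix, so any metric can be plugged in (here Manhattan distance on iris, with k = 5 chosen arbitrarily):

```r
## kNN classification from a precomputed distance matrix D: for each point,
## take the k nearest other points and vote by majority class.
D <- as.matrix(dist(iris[, 1:4], method = "manhattan"))
y <- iris$Species
knn_from_dist <- function(i, D, y, k = 5) {
  nn <- order(D[i, ])[2:(k + 1)]        # skip index 1: the point itself
  names(which.max(table(y[nn])))
}
pred <- sapply(seq_len(nrow(D)), knn_from_dist, D = D, y = y)
mean(pred == y)                          # leave-one-out accuracy
```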
The clustering itself has no problem with the p > n situation; however, the visualization internally uses princomp (which cannot handle p > n) to plot the similarity-space projection. You can't fix that directly; at most, you can try to reproduce a similar graph by obtaining the similarity-space projection with cmdscale(dist(...)) and coloring the points by cluster, as in the sketch below.
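A sketch of that workaround, where X stands for your data matrix and cl for the cluster labels returned by your clustering run (both names are placeholders):

```r
## Classical MDS (cmdscale) on the pairwise distances, colored by cluster.
## X and cl are placeholders for your data and cluster assignments.
proj <- cmdscale(dist(X), k = 2)
plot(proj, col = cl, pch = 19,
     xlab = "MDS dimension 1", ylab = "MDS dimension 2")
```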
Best Answer
I think your question has little to do with kNN. Really, the question is how best to visualize multi-dimensional data. The best ways to do that depend on your data. Are the variables continuous or categorical? How many data points are there? Are there particular variables you want to understand? I think the main bearing of kNN on your question is this last point: you particularly want to understand the predicted variable as it relates to the predictors.
I will give some examples for the case where you are predicting a categorical variable from several continuous variables, using the iris data set. I hope that these may help with your data.
Scatterplot matrix - shows all pairs of predictor dimensions with the category represented by the color of the points. Obviously, when you have many predictor variables, this can be difficult to see, but with a modest number of variables you can get a decent overview.
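For iris, a one-liner along these lines:

```r
## Scatterplot matrix of the four predictors, colored by species.
pairs(iris[, 1:4], col = as.integer(iris$Species), pch = 19)
```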
Parallel Coordinates plot - shows all predictor dimensions simultaneously with category represented by color. With many data points, these can get pretty messy.
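For example, using parcoord from the MASS package:

```r
## Parallel coordinates plot of the four predictors, colored by species.
library(MASS)
parcoord(iris[, 1:4], col = as.integer(iris$Species))
```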
Projection: There are many ways to project the data onto a "good" two-dimensional space, for example PCA, t-SNE, and Isomap. I will just illustrate with PCA.
Principal Components Analysis - A biplot projects the data onto the first two principal components, with points colored by category. This is a linear projection onto the two directions that show the most variation in the data.
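A sketch of that projection on iris:

```r
## Project onto the first two principal components and color by species.
pca <- prcomp(iris[, 1:4], scale. = TRUE)
plot(pca$x[, 1:2], col = as.integer(iris$Species), pch = 19,
     xlab = "PC1", ylab = "PC2")
```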