Solved – How to implement multiclass k-NN

k-nearest-neighbour, machine-learning

I want to implement k-NN for a multi-class dataset. I found the paper "A k-Nearest Neighbor Based Algorithm for Multi-label Classification", but I didn't understand the algorithm.

Do you know any clear explanation of it?

What I have in mind works as follows:

  1. Calculate the posterior probability of each class (simply by dividing the number of samples labelled as class_i by the total number of samples).

  2. Then, for each test sample, find its nearest neighbours (e.g. assume k = 5, so we find the 5 nearest neighbours among all samples in the training set).

  3. Then take the majority class among these 5 neighbours as the class of the test sample (so if 3 of the 5 nearest neighbours have class_x, we classify the test sample as class_x).

Would that be a correct implementation? A rough sketch of what I have in mind is below.
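Concretely, something like this minimal sketch (plain NumPy, Euclidean distance; the function and variable names are just placeholders I made up):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=5):
    """Classify a single test sample by majority vote among its
    k nearest training samples (Euclidean distance)."""
    # Step 1: class probabilities from training-set frequencies
    classes, counts = np.unique(y_train, return_counts=True)
    class_probs = dict(zip(classes, counts / len(y_train)))  # computed, but not used in the vote below

    # Step 2: indices of the k training samples closest to x_test
    distances = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(distances)[:k]

    # Step 3: majority class among those k neighbours
    votes = Counter(y_train[nearest])
    return votes.most_common(1)[0][0]
```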

Best Answer

It depends on what you mean by multi-class: are you talking about a setting where (a) one item can have multiple classes at once (multi-label), or (b) an item has exactly one of many classes (as opposed to binary classification)?

From your proposed algorithm I take it that you mean (b).

In that case, your idea is right, except that there is no need to take the prior into account (I think you mean prior rather than posterior probabilities in step 1). That is typically not done, although I have no solid argument for it right now.
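For what it's worth, plain majority voting with no prior weighting is also what off-the-shelf implementations do, and it works unchanged for any number of classes. A minimal usage sketch with scikit-learn's KNeighborsClassifier on made-up toy data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy three-class data: two features, labels 0/1/2.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [9, 0], [9, 1]])
y_train = np.array([0, 0, 0, 1, 1, 2, 2])

# Majority vote among the 3 nearest neighbours; nothing special
# is needed because y_train has more than two classes.
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
print(clf.predict(np.array([[0.5, 0.5], [8, 0.5]])))  # -> [0 2]
```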
