SVM – Interpreting Linear SVM Weights in Binary Classification: Class Significance

classification, interpretation, svm, weights

I'm trying to interpret the weights of a linear SVM that I use to classify the patients in my dataset into two classes: Alzheimer's and non-Alzheimer's. From this post I understand that the value of each weight can be interpreted as "how much a feature contributes to the classification". I'm interested in understanding how to relate the sign of a weight to the classification label: how do I infer that positive weights are associated with the Alzheimer's label and negative weights with the non-Alzheimer's label, and not vice versa? I also read this post, but its reasoning seems to rely on having two numerical labels, one positive and one negative…

Best Answer

Both positive and negative weights are associated with both the AD and non-AD labels. If you code AD as 1 and non-AD as 0, then a positive weight means the feature is positively associated with the AD label and a negative weight means it is negatively associated (and vice versa for the non-AD label). That is, a positive weight means that the larger that variable is, the higher the chance that a subject is classified as AD, and a negative weight means that the lower that variable is, the higher the chance of an AD classification.
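A minimal sketch of this, assuming scikit-learn and a synthetic two-class dataset (the feature distributions and labels below are made up for illustration): feature 0 is constructed to be higher on average for the class coded as 1 ("AD"), so the fitted weight for it comes out positive, and larger feature values push the decision function toward the AD side.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n = 200
# Synthetic data: non-AD (label 0) has feature 0 centered at 0,
# AD (label 1) has it centered at 2 -- higher values indicate AD.
X = np.vstack([rng.normal(0, 1, (n, 1)), rng.normal(2, 1, (n, 1))])
y = np.array([0] * n + [1] * n)

clf = LinearSVC(C=1.0, dual=False).fit(X, y)
w = clf.coef_[0]

# The weight on feature 0 is positive: larger values of that feature
# increase the decision function w @ x + b, and a positive decision
# function is what maps to the class coded as 1 (AD).
print(w[0] > 0)                                  # True
print(clf.decision_function([[3.0]])[0] > 0)     # True -> predicted AD
print(clf.predict([[3.0]])[0])                   # 1
```

If the labels were coded the other way around (AD as 0), the fitted weight vector would simply flip sign, which is why the sign of a weight is only interpretable relative to which class is coded as 1.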

That being said, I am not a huge fan of interpreting weights from ML models; see, for example, Haufe et al., "On the interpretation of weight vectors of linear models in multivariate neuroimaging", https://www.sciencedirect.com/science/article/pii/S1053811913010914