When performing linear SVM classification, it is often helpful to normalize the training data, for example by subtracting the mean and dividing by the standard deviation, and afterwards to scale the test data with the mean and standard deviation of the training data. Why does this process change the classification performance so dramatically?
Solved – Why scaling is important for the linear SVM classification
machine learning, standardization, svm
Best Answer
SVM tries to maximize the margin, i.e. the distance between the separating hyperplane and the support vectors. If one feature (one dimension of the feature space) takes values on a much larger scale than the others, it will dominate the distance calculation, so the hyperplane is placed almost entirely according to that one feature. If you rescale all features to a comparable range (e.g. to [0, 1], or to zero mean and unit variance), they all have the same influence on the distance metric.
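A minimal numpy sketch of this, using hypothetical data where one feature spans a far larger range than the other. It standardizes with the training mean and standard deviation only, applies the same transform to the test data, and shows how the large-scale feature dominates pairwise distances before scaling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: feature 0 spans roughly [0, 1], feature 1 roughly [0, 10000].
X_train = np.column_stack([rng.uniform(0, 1, 100), rng.uniform(0, 10_000, 100)])
X_test = np.column_stack([rng.uniform(0, 1, 20), rng.uniform(0, 10_000, 20)])

# Standardize using the *training* statistics only, then apply
# the same transform to the test data.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)
X_train_std = (X_train - mu) / sigma
X_test_std = (X_test - mu) / sigma

# Per-feature contribution to the squared Euclidean distance
# between two training points, before and after standardization.
a, b = X_train[0], X_train[1]
contrib_raw = (a - b) ** 2
print(contrib_raw / contrib_raw.sum())   # feature 1 dominates

a, b = X_train_std[0], X_train_std[1]
contrib_std = (a - b) ** 2
print(contrib_std / contrib_std.sum())   # contributions are comparable
```

Note that the test set is never used to compute `mu` and `sigma`; reusing the training statistics keeps the test data on the same scale the classifier was trained on and avoids leaking test information into the preprocessing.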