When performing linear SVM classification, it is often helpful to normalize the training data, for example by subtracting the mean and dividing by the standard deviation, and afterwards to scale the test data with the mean and standard deviation of the training data. Why does this process change the classification performance so dramatically?
Solved – Why scaling is important for the linear SVM classification
machine learning, standardization, svm
Best Answer
SVM tries to maximize the margin, i.e. the distance between the separating hyperplane and the support vectors. If one feature (one dimension of the feature space) takes values on a much larger scale than the others, it will dominate the distance calculation, so the hyperplane is placed almost entirely according to that one feature. If you rescale all features to a comparable range (e.g. to [0, 1], or to zero mean and unit variance), they all have the same influence on the distance metric.
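A minimal numpy sketch of this, using hypothetical data where one feature spans a far larger range than the other. It standardizes with the training mean and standard deviation only, applies the same transform to the test data, and shows how the large-scale feature dominates pairwise distances before scaling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: feature 0 spans roughly [0, 1], feature 1 roughly [0, 10000].
X_train = np.column_stack([rng.uniform(0, 1, 100), rng.uniform(0, 10_000, 100)])
X_test = np.column_stack([rng.uniform(0, 1, 20), rng.uniform(0, 10_000, 20)])

# Standardize using the *training* statistics only, then apply
# the same transform to the test data.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)
X_train_std = (X_train - mu) / sigma
X_test_std = (X_test - mu) / sigma

# Per-feature contribution to the squared Euclidean distance
# between two training points, before and after standardization.
a, b = X_train[0], X_train[1]
contrib_raw = (a - b) ** 2
print(contrib_raw / contrib_raw.sum())   # feature 1 dominates

a, b = X_train_std[0], X_train_std[1]
contrib_std = (a - b) ** 2
print(contrib_std / contrib_std.sum())   # contributions are comparable
```

Note that the test set is never used to compute `mu` and `sigma`; reusing the training statistics keeps the test data on the same scale the classifier was trained on and avoids leaking test information into the preprocessing.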