Solved – Is it a good idea to normalize the data consecutively with two different methods?

classification, data visualization, machine learning, multilabel, normalization

Let's say I have a dataset with 5 features. One row is (just to give you an idea of the range of data in each column)
[200456, 76, 2, 1, 0, 9986]
First I standardize each column to mean 0 and variance 1. Then I min-max scale each column to the range [0, 1]. This gives me really good results, but in theory I am not convinced this is a sound normalization method. I am working on multi-label classification, if that helps. Kindly let me know if more information is required.
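For concreteness, the two-step procedure described above can be sketched in plain Python (the extra rows below are hypothetical, invented only so the column statistics are well defined):

```python
# Sketch of the two-step normalization from the question (hypothetical data).
# Step 1: standardize each column to mean 0, variance 1 (z-score).
# Step 2: rescale each column to the [0, 1] range (min-max).

def zscore_columns(rows):
    """Standardize each column to mean 0 and (population) variance 1."""
    cols = list(zip(*rows))
    out_cols = []
    for col in cols:
        mean = sum(col) / len(col)
        var = sum((x - mean) ** 2 for x in col) / len(col)
        std = var ** 0.5 or 1.0  # guard against constant columns
        out_cols.append([(x - mean) / std for x in col])
    return [list(r) for r in zip(*out_cols)]

def minmax_columns(rows):
    """Rescale each column linearly to [0, 1]."""
    cols = list(zip(*rows))
    out_cols = []
    for col in cols:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # guard against constant columns
        out_cols.append([(x - lo) / span for x in col])
    return [list(r) for r in zip(*out_cols)]

data = [
    [200456, 76, 2, 1, 0, 9986],   # the row from the question
    [198000, 80, 3, 0, 1, 10012],  # hypothetical extra rows
    [201500, 70, 1, 1, 0, 9950],
]
scaled = minmax_columns(zscore_columns(data))
```

One observation this sketch makes easy to verify: since the z-score step is an affine map per column and min-max scaling is invariant to such maps, the composition produces exactly the same values as min-max scaling alone.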

Best Answer

The part about normalizing across rows pops out at me. It's usual to normalize a feature (column) so that, having done this for each feature, the features will be on more comparable scales. Normalizing across rows probably won't make any physical sense, and I'm not sure I can see any situation where it would be justified. (Imagine mashing a person's height, weight, and blood pressure together.)

Even if you normalize only the columns, note: if you normalize all of your data and then split it into train/test sets, your test results will look unrealistically good. Your training data represents what you have before you deploy your model, and your test data represents what comes in after deployment. By computing normalization statistics across that boundary, you allow data from the future (the test set) to leak into the present (the training set). That can't and won't happen in the real world, so the evaluation is overly optimistic.
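The leak-free version of this is to learn the normalization statistics from the training split only and reuse them on the test split. A minimal sketch, with hypothetical data and a hypothetical split:

```python
# Fit normalization statistics on the training split only, then apply
# the same statistics to the test split -- no information flows from
# test to train.

def fit_minmax(rows):
    """Learn per-column (min, max) from the training data only."""
    cols = list(zip(*rows))
    return [(min(c), max(c)) for c in cols]

def apply_minmax(rows, stats):
    """Scale rows using previously learned (min, max) per column."""
    return [
        [(x - lo) / ((hi - lo) or 1.0) for x, (lo, hi) in zip(row, stats)]
        for row in rows
    ]

train = [[200456, 76], [198000, 80], [201500, 70]]  # hypothetical
test = [[202000, 85]]                               # arrives "after deployment"

stats = fit_minmax(train)                 # statistics come from train only
train_scaled = apply_minmax(train, stats)
test_scaled = apply_minmax(test, stats)   # same statistics, never refit
```

Note that test values can land outside [0, 1]; that is expected and honest, because the model sees the test data exactly as it would see fresh production data.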