Solved – Feature scaling/normalization and prediction

machine learning, multidimensional scaling, normalization, prediction, regression

I have a dataset which I have split into a training and a test set.

I then normalized the training set and saved the mean (U) and standard deviation (SD) estimated from it.

Questions:

  1. If I apply an algorithm, e.g. linear regression, how can I apply the
    coefficients to the test set? Should I normalize the test set using
    the same U and SD as above and thereafter apply the coefficients?
  2. If I also normalize the prediction variable (Y), how can I calculate
    the "real" (unnormalized) prediction? Would it be (Y_norm+U)*SD?

Thank you

Best Answer

  1. Yes. Normalize the test set using the same U and SD estimated from the training set, then apply the coefficients.
  2. You generally do not need to normalize the prediction variable. Features are normalized so that they are treated on an equal scale; once they are, the model can learn the target in its original units. If you do normalize Y anyway, undo it with the mean and standard deviation of Y from the training set: Y = Y_norm*SD + U, not (Y_norm+U)*SD.
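The workflow discussed here can be sketched in plain Python. This is a minimal illustration, not a definitive implementation: the toy data, the one-feature least-squares fit, and the helper names (`fit_scaler`, `normalize`, `denormalize`, `fit_line`) are all assumptions made for the example. It estimates U and SD on the training set only, reuses them on the test set, and, for the case where the target is normalized anyway as the question proposes, maps predictions back with Y_norm*SD + U.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Return (U, SD) estimated from the given values (hypothetical helper)."""
    return mean(values), pstdev(values)


def normalize(values, u, sd):
    """Standardize values with a previously estimated mean and SD."""
    return [(v - u) / sd for v in values]


def denormalize(values, u, sd):
    """Invert the standardization: value * SD + U."""
    return [v * sd + u for v in values]


def fit_line(x, y):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy data, made up for illustration: y = 3x + 5 exactly.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [8.0, 11.0, 14.0, 17.0, 20.0]
x_test = [6.0, 10.0]

# Estimate U and SD on the training set only.
x_u, x_sd = fit_scaler(x_train)
y_u, y_sd = fit_scaler(y_train)

# Fit on normalized training data (normalizing Y here is optional).
slope, intercept = fit_line(
    normalize(x_train, x_u, x_sd),
    normalize(y_train, y_u, y_sd),
)

# Normalize the test set with the TRAINING statistics, then predict.
x_test_norm = normalize(x_test, x_u, x_sd)
y_pred_norm = [slope * v + intercept for v in x_test_norm]

# Map predictions back to the original units of Y.
y_pred = denormalize(y_pred_norm, y_u, y_sd)
print(y_pred)
```

On this toy data the recovered predictions are approximately 23 and 35, matching y = 3x + 5. If Y is left unnormalized, the last step is simply skipped and the model predicts in the original units directly.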