Solved – If you standardize X, must you always standardize y?

linear · multiple-regression · standardization

Related reading:

Background:

I am comparing the effectiveness of various forms of linear regression in machine learning, such as sklearn.linear_model.Ridge, sklearn.linear_model.Lasso, and sklearn.svm.SVR.

Question:

The linked questions above discuss various reasons to standardize, center, or leave unchanged the predictor variables in regression settings. If I standardize the X matrix, do I then have to standardize the y array? Likewise, if I center the X matrix, do I have to center the y array?

For either of those situations, would failing to standardize/center give me incorrect results?

Best Answer

No: standardizing the predictors $X$ does not force you to standardize the response $y$. Ask yourself "Why do I standardize?" and look at what the standardization is actually doing. Some answers to that can be found at: What algorithms need feature scaling, beside from SVM?

As for the additional question in the comments: the arguments in my answer linked above also apply to ridge and lasso. The arguments for standardizing $X$ in those cases do not apply to $y$ (though if you want, you can standardize $y$ too; it does no harm, but it can complicate interpretation). The same principles should apply to SVR, but I do not know the answer for certain in that case.
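A minimal sketch (not from the original answer) illustrating the "it does no harm" point for plain least squares: because an OLS fit with an intercept is equivariant under affine transformations of $y$, fitting on a standardized response and back-transforming the predictions gives exactly the same result as fitting on the raw response. The variable names and simulated data here are my own, chosen only for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Simulated data: 100 observations, 3 predictors, a known linear signal plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 5.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=100)

# Standardize the predictors only
X_std = StandardScaler().fit_transform(X)

# Fit 1: standardized X, raw y
pred_raw = LinearRegression().fit(X_std, y).predict(X_std)

# Fit 2: standardized X AND standardized y, then un-standardize the predictions
y_mu, y_sd = y.mean(), y.std()
y_scaled = (y - y_mu) / y_sd
pred_back = LinearRegression().fit(X_std, y_scaled).predict(X_std) * y_sd + y_mu

# The two sets of predictions coincide (up to floating-point tolerance)
print(np.allclose(pred_raw, pred_back))
```

Note that this exact equivalence holds for unpenalized least squares; with ridge or lasso at a fixed penalty, rescaling $y$ effectively rescales the regularization strength, which is one reason the interpretation can get more complicated.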