Solved – Effect of scaled down variables in Logistic regression

logistic

I have a logistic regression model with 5 continuous independent variables and a categorical variable.

I scaled down one of the continuous variables using the formula
Scaled_Value = New_Min_Value + (New_Range / Old_Range) * (Value - Old_Min_Value)
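
For concreteness, the rescaling formula above can be sketched as follows. This is only an illustration: the function name `rescale`, the sample values, and the target range [0, 1] are assumptions, not part of the original setup. It also returns the offset and slope, which shows that the rescaled value is just `offset + ratio * value`.

```python
def rescale(values, new_min, new_max):
    """Min-max rescale `values` onto [new_min, new_max] (hypothetical helper)."""
    old_min, old_max = min(values), max(values)
    # Ratio between the new range and the old range (the slope of the map).
    ratio = (new_max - new_min) / (old_max - old_min)
    # Constant term once the formula is expanded: new_min - ratio * old_min.
    offset = new_min - ratio * old_min
    scaled = [new_min + ratio * (v - old_min) for v in values]
    return scaled, offset, ratio


# Made-up sample values, rescaled onto [0, 1].
x = [12.0, 20.0, 35.0, 50.0, 62.0]
scaled, c0, c1 = rescale(x, 0.0, 1.0)

for value, s in zip(x, scaled):
    print(value, s, c0 + c1 * value)  # the last two columns agree
```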

I checked the shape of the scaled variable using skewness and kurtosis. They were the same as for the original variable.
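
That skewness and kurtosis do not change is expected: both are standardized moments, so shifting and rescaling by a positive factor leaves them untouched. A minimal sketch, assuming made-up data and hypothetical constants 0.1 and 0.02 for the rescaling:

```python
import math


def standardized_moment(values, k):
    """k-th standardized moment: k = 3 gives skewness, k = 4 gives kurtosis."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** k for v in values) / n


# Made-up sample and a hypothetical rescaling with positive slope.
x = [12.0, 20.0, 35.0, 50.0, 62.0, 90.0]
z = [0.1 + 0.02 * v for v in x]

skew_x, skew_z = standardized_moment(x, 3), standardized_moment(z, 3)
kurt_x, kurt_z = standardized_moment(x, 4), standardized_moment(z, 4)

print(skew_x, skew_z)  # equal up to floating-point rounding
print(kurt_x, kurt_z)  # equal up to floating-point rounding
```

With a negative slope the skewness would flip sign, while the kurtosis would still be unchanged.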

Next, I fitted the logistic regression model again with all the variables: 4 continuous + 1 categorical + the scaled version of the continuous variable in place of the original one.

Compared to the first model, the scores and the coefficients of all variables are the same, except for the intercept and the coefficient of the scaled variable, which have changed.

I'm trying to understand the reason behind this. How did the scores remain unchanged? And why do the intercept and the coefficient of the scaled variable have to change? Thanks.

Best Answer

Your scaled variable, denote it $Z_1$, is an affine function of the original variable,

$$Z_1 = c_0 + c_1X_1$$ where $c_0,c_1$ are constants determined by the equation in your question: $c_1$ is the ratio of the new range to the old range, and $c_0 = \text{NewMin} - c_1\,\text{OldMin}$.

In logistic regression, we are estimating a conditional probability for which we have assumed a specific functional form

$$P(Y=1\mid X_1,\ldots,X_k) = \frac 1{1+e^{-g(\mathbf X)}}, \;\;g(\mathbf X) = b_0+b_1X_1+\cdots+b_kX_k $$

You inserted $Z_1$ instead of $X_1$. So in this case you essentially specified

$$g(\mathbf X_{-1}, Z_1) = b_0+b_1(c_0 + c_1X_1)+\cdots+b_kX_k $$

$$= (b_0+b_1c_0) + b_1c_1X_1+\cdots+b_kX_k = d_0 + d_1X_1+\cdots+b_kX_k$$

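In other words, you are back to the original regressor matrix, but with a different intercept, $d_0 = b_0+b_1c_0$, and a different coefficient for the scaled variable, $d_1 = b_1c_1$. Both specifications describe the same family of models, so maximum likelihood gives the same fitted probabilities (your scores) and the same coefficients for the untouched variables; only the intercept and the coefficient of the scaled variable absorb the change of units. I hope this helps.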
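
If you want to check this numerically, here is a rough, dependency-free sketch. Everything in it is an assumption for illustration: the simulated data, the "true" coefficients, the helper names, and the use of plain Newton iterations in place of whatever software you fitted your model with. It fits the model once with $X_1$ and once with $Z_1$ (rescaled to [0, 1]), using one extra unscaled regressor.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_logistic(rows, y, iterations=25):
    """Maximum-likelihood fit by Newton's method; each row starts with 1.0."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * row[i]
                for j in range(k):
                    hess[i][j] += w * row[i] * row[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


def predict(rows, beta):
    return [sigmoid(sum(b * v for b, v in zip(beta, row))) for row in rows]


# Simulated data (assumed, not from the question).
rng = random.Random(0)
x1 = [rng.uniform(10, 60) for _ in range(300)]
x2 = [rng.gauss(0, 1) for _ in range(300)]
y = [1.0 if rng.random() < sigmoid(-3 + 0.08 * a + 0.5 * b) else 0.0
     for a, b in zip(x1, x2)]

# Z1 = c0 + c1 * X1, rescaled onto [0, 1].
c1 = 1.0 / (max(x1) - min(x1))
c0 = -c1 * min(x1)
z1 = [c0 + c1 * a for a in x1]

rows_x = [[1.0, a, b] for a, b in zip(x1, x2)]
rows_z = [[1.0, a, b] for a, b in zip(z1, x2)]
beta_x = fit_logistic(rows_x, y)  # (d0, d1, b2) in the notation above
beta_z = fit_logistic(rows_z, y)  # (b0, b1, b2) in the notation above

print("intercept:", beta_x[0], "vs b0 + b1*c0 =", beta_z[0] + beta_z[1] * c0)
print("slope:    ", beta_x[1], "vs b1*c1 =", beta_z[1] * c1)
print("other:    ", beta_x[2], "vs", beta_z[2])
scores_x, scores_z = predict(rows_x, beta_x), predict(rows_z, beta_z)
print("max score difference:",
      max(abs(p - q) for p, q in zip(scores_x, scores_z)))
```

The two fits should agree on the coefficient of the unscaled regressor and on every fitted probability up to rounding, while the intercept and the coefficient of the rescaled variable differ exactly as $d_0 = b_0+b_1c_0$ and $d_1 = b_1c_1$ predict.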
