Logistic Regression – Relationship Between Regressing Y on X and X on Y

Tags: logistic-regression, regression-coefficients

Correlation and linear regression are sometimes distinguished in statistics books by saying that the former is symmetric and the latter asymmetric, in the following sense: correlation makes no distinction between dependent and independent variables, whereas in a regression equation it matters which variable is treated as dependent and which as independent. To be sure, it is possible to treat $Y$ as a predictor of $X$ instead of treating $X$ as a predictor of $Y$, using the following equations:

$X$ as a predictor of $Y$:
\begin{align}
\text{Slope: } b_1 &= r_{XY} \cdot \frac{s_Y}{s_X} \\
\text{Intercept: } b_0 &= \bar{y} - b_1 \bar{x}
\end{align}

$Y$ as a predictor of $X$:
\begin{align}
\text{Slope: } b_1 &= r_{XY} \cdot \frac{s_X}{s_Y} \\
\text{Intercept: } b_0 &= \bar{x} - b_1 \bar{y}
\end{align}
However, the regression line fitted to the scatter plot will differ depending on whether $X$ is treated as a predictor of $Y$ or $Y$ as a predictor of $X$.
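As a quick numerical check of these formulas, here is a minimal R sketch on simulated data (the data are invented purely for illustration):

    set.seed(1)
    x <- rnorm(20)
    y <- 0.5 * x + rnorm(20)
    r <- cor(x, y)

    coef(lm(y ~ x))[["x"]]   # slope of Y on X ...
    r * sd(y) / sd(x)        # ... equals r * s_Y / s_X
    coef(lm(x ~ y))[["y"]]   # slope of X on Y ...
    r * sd(x) / sd(y)        # ... equals r * s_X / s_Y

Note that the product of the two slopes is $r_{XY}^2$, so the two fitted lines coincide only when the correlation is $\pm 1$.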

My question is whether something similar holds in the case of logistic regression: is it possible to formulate analogous equations for the regression coefficients when $Y$ is used as a predictor of $X$?

Best Answer

Yes, there is a similar relationship: for circumstances where it makes sense and where both variables are coded by $0$ and $1$ (the analog of standardization), the "slope" in the logistic regression of $Y$ against $X$ equals the slope in the logistic regression of $X$ against $Y$.

Recall that (univariate) logistic regression models a binary response $Y$ in terms of a variable $x$ and a constant, using two parameters $\beta_0$ and $\beta_1$, by stipulating that the chance of $Y$ equaling one of its values (generically termed "success") can be modeled by

$$\mathbb O(Y=\text{success}) = \beta_0 + \beta_1 x$$

where "$\mathbb O$" refers to the log odds, equal to the logarithm of the odds $\Pr(\text{success}) / \Pr(\text{not success})$.

The only circumstance under which it makes sense to switch the roles of $Y$ and $x$, then, is when $x$ also is binary. That compels us to view its outcomes now as draws from a random variable $X$. The values of $Y$ must be encoded as fixed (nonrandom) values $1$ for "success" and $0$ otherwise. We might as well assume, then, that the encoding $1$="success" and $0$="not success" has been used all along for both variables.

Notice that the data in this situation can be considered a two-by-two contingency table in which the counts of all four possible combinations of $x$ and $y$ are displayed. Let the counts for $x=i$ and $y=j$ be written $n_{ij}$, for $i=0,1$ and $j=0,1$.
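In R such a table is obtained directly from raw 0/1 data with table(); the vectors below are hypothetical, chosen so that the counts match the two-way table used in the example further down:

    x <- c(0,0,0,0, 1,1,1,1,1,1)
    y <- c(0,1,1,1, 0,0,1,1,1,1)
    table(x, y)   # rows: x = 0,1; columns: y = 0,1; entries are the n_ij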

The conventional estimates of the parameters are found by maximum likelihood: set the gradient of the log likelihood equal to zero and solve. In the first case, viewing $Y$ as the dependent variable, the likelihood equations are

$$\cases{ 0 = n_{01} + n_{11} - \frac{n_{00}+n_{01}}{1+\exp(\beta_0)} - \frac{n_{10}+n_{11}}{1+\exp(\beta_0+\beta_1)} \\ 0 = n_{11} - \frac{n_{10} + n_{11}}{1+\exp(\beta_0+\beta_1)} }$$

When all the $n_{ij}\ne 0$, the solution is

$$\cases{ \beta_0 = \log(n_{00}) - \log(n_{01}),\\ \beta_1 = \log(n_{01}) + \log(n_{10}) - \log(n_{00}) - \log(n_{11}).}$$

Switching the roles of the variables merely permutes the subscripts of the $n$'s (although $\beta_0$ and $\beta_1$ now have different meanings, because the linear predictor involves the $y$ values instead of the $x$ values). But the solution for $\beta_1$ is symmetric under that permutation, so it remains unchanged: $\beta_1$ is a log odds ratio of the table, and an odds ratio does not care which variable is treated as the response. This is the "slope" term, and it is the perfect analog of the regression coefficient in ordinary least squares regression.
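As a sanity check, the closed form can be evaluated directly on the counts from the example below (the variable names n00, ..., n11 are just illustrative):

    n00 <- 1; n01 <- 3; n10 <- 2; n11 <- 4
    b0 <- log(n00) - log(n01)                        # intercept when Y is regressed on X
    b1 <- log(n01) + log(n10) - log(n00) - log(n11)  # the common "slope"
    c(b0, b1)   # -1.0986  0.4055, agreeing with the glm fits below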


Example

Software will confirm this result. Here, for instance, are the results of the two logistic regressions in R using the following two-way table:

          Y=0   Y=1
    X=0:    1     3
    X=1:    2     4

Regressing $Y$ against $X$ gives $(\hat\beta_0, \hat\beta_1)$ = $(\log(1/3), \log(3/2))$ = $(-1.0986, 0.4055)$ while regressing $X$ against $Y$ gives $(\hat\beta_0, \hat\beta_1)$ = $(\log(1/2), \log(3/2))$ = $(-0.6931, 0.4055)$.

    # The response is a two-column matrix of counts: column 1 = "successes"
    # (here Y=0), column 2 = "failures" (Y=1); rows index the predictor levels.
    y <- matrix(c(1,2,3,4), nrow=2)
    (fit <- glm(y ~ as.factor(0:1), family=binomial))       # regress Y against X
    (fit.t <- glm(t(y) ~ as.factor(0:1), family=binomial))  # regress X against Y

The output shows that both the slope and the null deviance remain the same when $X$ and $Y$ are switched:

Coefficients:
    (Intercept)  as.factor(0:1)1  
        -1.0986           0.4055  

Degrees of Freedom: 1 Total (i.e. Null);  0 Residual
Null Deviance:      0.08043 
Residual Deviance: 2.22e-16     AIC: 7.948 


Coefficients:
    (Intercept)  as.factor(0:1)1  
        -0.6931           0.4055  

Degrees of Freedom: 1 Total (i.e. Null);  0 Residual
Null Deviance:      0.08043 
Residual Deviance: 4.441e-16    AIC: 8.072 