Logistic regression does not have decision boundaries. It is a method for estimating probabilities of events/class membership. Decisions are made in a separate step, once the estimated risk is combined with utilities/costs/a loss function; that is how optimal decisions are made.
This is actually straightforward. We think of statistical models as specifying a conditional response distribution, which is stochastic, but once you are working with the fitted model, it is just a deterministic function. In this case, a logistic regression model specifies the conditional parameter $\pi$ that governs the behavior of a binomial distribution. That is:
$$
\ln\bigg(\frac{\pi}{(1-\pi)}\bigg) = \beta_0 + \beta_1X_1 + \beta_2X_2
$$
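Concretely, the fitted model maps a point $(X_1, X_2)$ to a probability through the inverse logit. A minimal sketch in Python (the coefficient values here are hypothetical, purely for illustration):

```python
import numpy as np

def predicted_prob(x1, x2, b0, b1, b2):
    # Linear predictor on the log-odds scale: eta = b0 + b1*x1 + b2*x2
    eta = b0 + b1 * x1 + b2 * x2
    # Inverse logit: pi = 1 / (1 + exp(-eta))
    return 1.0 / (1.0 + np.exp(-eta))

# When the linear predictor is 0, the predicted probability is exactly .5
p = predicted_prob(0.0, 0.0, 0.0, 1.0, 1.0)  # eta = 0, so p = 0.5
```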
With respect to assigning predicted classes, the most intuitive thing to do is call an observation a 'success' if $\hat\pi_i>.5$ or a 'failure' if not. (Note that using $.5$ as your threshold will not necessarily maximize the accuracy of a given model, and that any conversion from predicted probabilities to predicted classes throws away a lot of information, probably unnecessarily.) Using $.5$ on the probability scale corresponds to using $0$ on the log odds (linear) scale. If we only want to know the set of all points in the $X_1$, $X_2$ space that correspond to a predicted log odds of $0$, we can set the fitted model equal to $0$ and then algebraically rearrange the equation to make one variable a function of the other (in the example, weight as a function of height). That's just algebra. Once you have that, you can plot the decision boundary on the $X_1$, $X_2$ (height, weight) plane.
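That rearrangement can be sketched directly: set the fitted log odds to $0$ and solve for weight as a function of height (the coefficient estimates below are hypothetical, not from any fitted model):

```python
def boundary_weight(height, b0, b1, b2):
    # Solve 0 = b0 + b1*height + b2*weight for weight
    return -(b0 + b1 * height) / b2

# hypothetical coefficient estimates, for illustration only
b0_hat, b1_hat, b2_hat = -10.0, 0.5, 0.25
w = boundary_weight(70.0, b0_hat, b1_hat, b2_hat)
```

Evaluating `boundary_weight` over a grid of heights traces out the full decision boundary line.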
To solve for weight when height is $0$:
\begin{align}
0 &= \hat\beta_0 + \hat\beta_1(0) + \hat\beta_2{\rm weight} \\[8pt]
-\hat\beta_0 &= \hat\beta_2{\rm weight} \\[8pt]
\frac{-\hat\beta_0}{\hat\beta_2} &= \text{weight (i.e., the intercept)} \\[20pt]
\end{align}
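With hypothetical coefficient estimates plugged in, the boundary's intercept is just that ratio:

```python
# hypothetical fitted coefficients (illustration only)
b0_hat, b2_hat = -10.0, 0.25

# weight on the boundary when height = 0, i.e., the intercept -b0/b2
intercept = -b0_hat / b2_hat  # 40.0
```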
To solve for the increase in weight when height goes up by $1$ unit (inch), let's use two points, where height equals $0$ and where height equals $1$. (Since it's a straight line, any two points would do, but these are convenient.) Then:
\begin{align}
0 &= \hat\beta_0 + \hat\beta_1{\rm height}_1 + \hat\beta_2{\rm weight}_1 \\[8pt]
&\quad -(\hat\beta_0 + \hat\beta_1{\rm height}_0 + \hat\beta_2{\rm weight}_0) \\[8pt]
0 &= \hat\beta_0 - \hat\beta_0 + \hat\beta_1{\rm height}_1 - \hat\beta_1{\rm height}_0 + \hat\beta_2{\rm weight}_1 - \hat\beta_2{\rm weight}_0 \\[8pt]
0 &= \hat\beta_1 + \hat\beta_2\Delta{\rm weight} \\[8pt]
\frac{-\hat\beta_1}{\hat\beta_2} &= \Delta{\rm weight} \text{ (i.e., the slope)} \\
\end{align}
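The same two-point argument can be checked numerically: with hypothetical coefficients, the change in boundary weight between height $0$ and height $1$ matches $-\hat\beta_1/\hat\beta_2$ exactly:

```python
# hypothetical fitted coefficients (illustration only)
b0_hat, b1_hat, b2_hat = -10.0, 0.5, 0.25

def boundary_weight(height):
    # weight on the zero-log-odds line at a given height
    return -(b0_hat + b1_hat * height) / b2_hat

slope = -b1_hat / b2_hat                              # -0.5/0.25 = -2.0
delta = boundary_weight(1.0) - boundary_weight(0.0)   # same quantity, by differencing
```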
Best Answer
I must remark that perfect separation occurs here, therefore the `glm` function gives you a warning. But that is not important here, as the purpose is to illustrate how to draw the linear boundary and the observations colored according to their covariates.
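For context, "perfect separation" means some line in the (height, weight) plane splits the two classes exactly; the maximum-likelihood coefficients then diverge, which is why R's `glm` warns about fitted probabilities numerically 0 or 1. A quick numeric illustration with hypothetical toy data (not the question's data):

```python
import numpy as np

# hypothetical toy data: classes perfectly separated by the line weight = 140
height = np.array([60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 74.0])
weight = np.array([110.0, 115.0, 120.0, 125.0, 160.0, 165.0, 170.0, 175.0])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# boundary 0 = b0 + b1*height + b2*weight with b0 = -140, b1 = 0, b2 = 1
margin = -140.0 + 0.0 * height + 1.0 * weight

# every class-1 point sits on the positive side, every class-0 on the negative side
perfectly_separated = bool(np.all((margin > 0) == (y == 1)))
```

When this holds, the likelihood keeps increasing as the coefficients grow without bound, so the fitted probabilities are driven to 0 and 1.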