Solved – How to calculate logistic regression coefficients manually

estimation, logistic, nonlinear regression, regression coefficients

For two independent variables, how do we calculate the coefficients for any dataset in logistic regression? The equation we know is $\text{logit} = \ln\!\left(\tfrac{P}{1-P}\right) = B_0 + B_1 X_1 + B_2 X_2$.
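For reference, solving the logit equation above for $P$ (a standard rearrangement, not extra information given in the question) yields the predicted probability:

$P = \dfrac{1}{1 + \exp\!\big(-(B_0 + B_1 X_1 + B_2 X_2)\big)}$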

For the dataset below, how do we calculate the coefficients $B_0$, $B_1$, and $B_2$?

Y   X1   X2
0    2   30
0    6   50
1    8   60
1   10   80

Best Answer

Unlike linear regression, where matrix algebra and ordinary least squares give the result in closed form, logistic regression needs some kind of optimization algorithm to find the solution with the smallest loss, or greatest likelihood. For this, logistic regression most commonly uses iteratively reweighted least squares, but if you really want to compute it by hand, it is probably easier to use gradient descent. You can find a nice introduction to gradient descent in the lecture "Lecture 6.5 — Logistic Regression | Simplified Cost Function And Gradient Descent" by Andrew Ng (if it is unclear, see the earlier lectures, also available on YouTube or on Coursera). Using his notation, the iteration step is

$ \mathtt{repeat} \, \{\\ \qquad\theta_j := \theta_j - \alpha \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}\\ \}$

where $\theta_j$ is the $j$-th parameter from the vector $(\theta_0, \theta_1, \dots, \theta_k)$, $x^{(i)}$ is the vector of variables for the $i$-th observation, $(1, x_1^{(i)},\dots, x_k^{(i)})$, with the leading $1$ coming from the column of ones for the intercept, $h_\theta(x) = \tfrac{1}{1+\exp(-\theta^T x)}$ is the inverse of the logistic link function, $\alpha$ is the learning rate, and $m$ is the number of observations. You update all $\theta_j$ simultaneously and iterate until convergence.
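To make the update concrete, here is a minimal NumPy/SciPy sketch of that batch gradient-descent loop applied to the data in the question. This code is not from the original answer; the learning rate and iteration count are arbitrary choices you would normally tune.

```python
import numpy as np
from scipy.special import expit  # numerically stable inverse logit, 1/(1+exp(-z))

# Data from the question, with a leading column of ones for the intercept theta_0.
X = np.array([[1.0,  2.0, 30.0],
              [1.0,  6.0, 50.0],
              [1.0,  8.0, 60.0],
              [1.0, 10.0, 80.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta = np.zeros(X.shape[1])  # (theta_0, theta_1, theta_2), i.e. (B0, B1, B2)
alpha = 1e-4                  # learning rate, chosen small because X2 is unscaled
n_steps = 200_000             # "repeat ... until convergence"

for _ in range(n_steps):
    # Batch gradient of the negative log-likelihood:
    # sum_i (h_theta(x^(i)) - y^(i)) * x^(i), written as a matrix product.
    gradient = X.T @ (expit(X @ theta) - y)
    theta -= alpha * gradient

print(theta)  # fitted (B0, B1, B2) after n_steps updates
```

One caveat about this particular toy dataset: the two classes are perfectly separated (every row with $X_1 \ge 8$ has $Y = 1$), so the likelihood has no finite maximum and the coefficients keep growing the longer you run the loop. On less trivial data the same loop settles on finite estimates, and standardizing the predictors lets you use a larger learning rate.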
