Logistic Regression vs. LDA – Two-Class Classifiers Explained

classification, discriminant analysis, logistic regression

I am trying to wrap my head around the statistical difference between linear discriminant analysis (LDA) and logistic regression. Is my understanding right that, for a two-class classification problem, LDA estimates two normal density functions (one for each class), which create a linear boundary where they intersect, whereas logistic regression only estimates the log-odds function between the two classes, which creates a boundary but does not assume a density function for each class?

Best Answer

It sounds to me that you are correct. Logistic regression indeed does not assume any specific shapes of densities in the space of predictor variables, but LDA does. Here are some differences between the two analyses, briefly.
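To make the density point concrete, here is a short derivation sketch. The symbols are my own notation, not from the question: class means $\mu_0, \mu_1$, a common within-class covariance matrix $\Sigma$, prior probabilities $\pi_0, \pi_1$, and logistic coefficients $\beta_0, \beta$. If LDA's assumption $x \mid y=k \sim N(\mu_k, \Sigma)$ holds, Bayes' rule gives log-odds that are linear in $x$:

$$
\log\frac{P(y=1\mid x)}{P(y=0\mid x)}
= \log\frac{\pi_1}{\pi_0}
- \tfrac{1}{2}(\mu_1+\mu_0)^{\top}\Sigma^{-1}(\mu_1-\mu_0)
+ x^{\top}\Sigma^{-1}(\mu_1-\mu_0).
$$

Logistic regression posits the same linear form directly,

$$
\log\frac{P(y=1\mid x)}{P(y=0\mid x)} = \beta_0 + \beta^{\top}x,
$$

and estimates $\beta_0, \beta$ from the conditional likelihood of $y$ given $x$, without saying anything about the distribution of $x$. So both methods yield a linear boundary (where the log-odds equal zero); they differ in what is assumed and in how the coefficients are estimated.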

Binary logistic regression (BLR) vs. linear discriminant analysis (LDA) with 2 groups, also known as Fisher's LDA:

  • BLR: Based on maximum likelihood estimation.
    LDA: Based on least squares estimation; equivalent to linear regression with a binary predictand (the coefficients are proportional and R-square = 1 - Wilks' lambda).

  • BLR: Estimates the probability of group membership directly (the predictand is itself treated as an observed probability) and conditionally on the predictors.
    LDA: Estimates the probability indirectly (the predictand is viewed as a binned continuous variable, the discriminant) via a classification device (such as naive Bayes) that uses both conditional and marginal information.

  • BLR: Not so demanding about the level of measurement and the form of the distribution of the predictors.
    LDA: Predictors should preferably be interval-level with a multivariate normal distribution.

  • BLR: No requirements about the within-group covariance matrices of the predictors.
    LDA: The within-group covariance matrices should be identical in the population.

  • BLR: The groups may have quite different $n$.
    LDA: The groups should have similar $n$.

  • BLR: Not so sensitive to outliers.
    LDA: Quite sensitive to outliers.

  • BLR: Younger method.
    LDA: Older method.

  • BLR: Usually preferred, because it is less demanding and more robust.
    LDA: With all its requirements met, often classifies better than BLR (its asymptotic relative efficiency is then 3/2 times higher).
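
The least-squares claim in the first bullet (LDA coefficients proportional to those of a linear regression on a 0/1 predictand, and R-square = 1 - Wilks' lambda) can be checked numerically. Below is a minimal sketch in plain Python, assuming two predictors and simulated Gaussian data; the sample sizes, means, seed, and helper names (`mean_vec`, `scatter`, `det2`, `solve2`) are all illustrative choices of mine, not part of the answer above.

```python
import random

def mean_vec(rows):
    """Column means of a list of 2-element rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, center):
    """2x2 sum-of-squares-and-cross-products matrix around `center`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def solve2(m, v):
    """Solve the 2x2 linear system m x = v by Cramer's rule."""
    det = det2(m)
    return [(v[0] * m[1][1] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

# Simulated data (assumed setup): two groups with unequal sizes.
rng = random.Random(0)
group0 = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(60)]
group1 = [[rng.gauss(1.5, 1.0), rng.gauss(0.5, 1.0)] for _ in range(40)]

# LDA direction: W^{-1} (m1 - m0), with W the pooled within-group scatter.
m0, m1 = mean_vec(group0), mean_vec(group1)
s0, s1 = scatter(group0, m0), scatter(group1, m1)
W = [[s0[a][b] + s1[a][b] for b in range(2)] for a in range(2)]
diff = [m1[0] - m0[0], m1[1] - m0[1]]
w_lda = solve2(W, diff)

# Least-squares slopes of the 0/1 group indicator on the predictors,
# computed on centered data: b = T^{-1} Xc'yc, with T the total scatter.
rows = group0 + group1
y = [0.0] * len(group0) + [1.0] * len(group1)
xbar, ybar = mean_vec(rows), sum(y) / len(y)
T = scatter(rows, xbar)
xy = [sum((r[j] - xbar[j]) * (yi - ybar) for r, yi in zip(rows, y))
      for j in range(2)]
b_ols = solve2(T, xy)

# R-square of that regression, and Wilks' lambda = det(W) / det(T).
r_squared = (b_ols[0] * xy[0] + b_ols[1] * xy[1]) / sum((yi - ybar) ** 2 for yi in y)
wilks_lambda = det2(W) / det2(T)

print("LDA direction:   ", w_lda)
print("OLS slopes:      ", b_ols)
print("R-square:        ", r_squared)
print("1 - Wilks lambda:", 1.0 - wilks_lambda)
```

The two coefficient vectors should point in the same direction (they differ only by a scalar factor), and the last two printed values should agree up to floating-point error. Note that this only illustrates the algebraic equivalence on one simulated sample; it says nothing about which method classifies better.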