I am new to machine learning and am studying classification at the moment. Is it correct to say that logistic regression with more than two classes (i.e. multinomial regression) and LDA are both methods for classifying new points when there are more than two classes? Or are they different things, so that it is not fair to compare them? I know that in LDA we use Bayes' theorem and assume that the distribution of the predictors is approximately normal within each class, and that LDA is said to be more stable than the logistic regression model, but do they do the same thing in the end? Sorry if the question sounds silly, but I am majorly confused.
Solved – the difference between multinomial logistic regression and linear discriminant analysis
Related Solutions
Binary or Multinomial: Perhaps the following rules will simplify the choice:
- If you have only two levels to your dependent variable then you use binary logistic regression.
- If you have three or more unordered levels to your dependent variable, then you'd look at multinomial logistic regression.
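The multinomial case in the second rule can be sketched in a few lines. This is a minimal illustration (using scikit-learn and the iris data purely as a stand-in for a three-level unordered outcome, not the questioner's data):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris has three unordered class labels, so multinomial logistic
# regression applies (binary logistic regression would not fit here).
X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000).fit(X, y)  # multinomial for 3+ classes

# The fitted model returns one probability per class for each new point.
probs = clf.predict_proba(X[:1])
print(probs.shape)  # (1, 3); each row sums to 1
```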
A few points:
- Satisfaction with sexual needs ranges from 4 to 16 (i.e., 13 distinct values). Such a variable is typically treated as a metric predictor (i.e., in the covariate box in SPSS).
- Possibly your dependent variable is causing some confusion because as you phrase it, it is not a standard dichotomy. It sounds like a frequency item that could range from never, to occasionally, to sometimes, to often, to always, etc. However, I'm guessing that either you have explicitly collapsed categories or you have required the respondent to implicitly collapse the categories down to a binary choice. As a side note, if you did have an ordered set of frequency categories, then you might want to use a model that incorporated that order.
SPSS: I posted some links to tutorials in SPSS and R for conducting binary logistic regression.
I take it that the question is about LDA and linear (not logistic) regression.
There is a considerable and meaningful relation between linear regression and linear discriminant analysis. When the dependent variable (DV) consists of just 2 groups, the two analyses are actually identical. Although the computations differ and the results (regression and discriminant coefficients) are not the same, they are exactly proportional to each other.
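That two-group proportionality can be checked numerically. Below is a minimal sketch (synthetic data via scikit-learn, chosen for illustration): regressing a 0/1 group indicator on the predictors by OLS and fitting a two-group LDA give coefficient vectors whose elementwise ratio is constant.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical two-group data with three predictors.
X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

ols = LinearRegression().fit(X, y)            # regress the 0/1 indicator on X
lda = LinearDiscriminantAnalysis().fit(X, y)  # two-group LDA

# The two coefficient vectors are not equal, but their elementwise
# ratio is (up to numerical precision) a single constant:
ratios = lda.coef_.ravel() / ols.coef_.ravel()
print(ratios)
```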
Now for the more-than-two-groups situation. First, let us state that LDA (its extraction stage, not its classification stage) is equivalent (gives linearly related results) to canonical correlation analysis if you turn the grouping DV into a set of dummy variables (with one redundant dummy dropped) and run canonical analysis on the sets "IVs" and "dummies". The canonical variates on the "IVs" side that you obtain are what LDA calls "discriminant functions" or "discriminants".
So, how is canonical analysis related to linear regression? Canonical analysis is in essence a MANOVA (in the sense of "multivariate multiple linear regression" or "multivariate general linear model") deepened into the latent structure of the relationships between the DVs and the IVs. The two sets of variables are decomposed, via their inter-relations, into latent "canonical variates". Take the simplest example, Y vs X1 X2 X3. Maximizing the correlation between the two sides is linear regression (if you predict Y by the Xs) or, which is the same thing, MANOVA (if you predict the Xs by Y). The correlation is unidimensional (with magnitude R^2 = Pillai's trace) because the lesser set, Y, consists of just one variable. Now take these two sets: Y1 Y2 vs X1 X2 X3. The correlation being maximized here is 2-dimensional because the lesser set contains 2 variables. The first and stronger latent dimension of the correlation is called the 1st canonical correlation, and the remaining part, orthogonal to it, the 2nd canonical correlation. MANOVA (or linear regression) just asks what the partial roles (the coefficients) of the variables are in the whole 2-dimensional correlation of the sets, while canonical analysis goes below that to ask what the partial roles of the variables are in the 1st correlational dimension and in the 2nd.
Thus, canonical correlation analysis is multivariate linear regression deepened into the latent structure of the relationship between the DVs and the IVs, and discriminant analysis is a particular case of canonical correlation analysis (see exactly how). This, then, is the answer about the relation of LDA to linear regression in the general case of more than two groups.
Note that my answer does not at all treat LDA as a classification technique; I was discussing LDA only as a technique for extracting latents. Classification is the second, stand-alone stage of LDA (I described it here). @Michael Chernick was focusing on it in his answers.
Best Answer
They are not the same thing. This question in various forms is asked & answered multiple times on this site, see for example Discriminant analysis vs logistic regression or Linear discriminant analysis and logistic regression and especially Why isn't Logistic Regression called Logistic Classification?. Read those posts carefully!
Summary:
- While logistic regression (binomial or multinomial is unimportant here) can be used for classification, that requires some extra decisions, such as probability thresholds for the classes.
- Logistic regression is not classification; it is risk estimation.
- Logistic regression is often more robust than LDA, since it makes no distributional assumptions, such as normality, about the predictors. Such assumptions are rarely fulfilled.
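To make the risk-estimation-versus-classification distinction concrete, here is a minimal sketch (synthetic data via scikit-learn): the fitted model itself outputs probability estimates, and turning those into class labels requires a separate, user-chosen threshold.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, random_state=1)
model = LogisticRegression().fit(X, y)

# What logistic regression actually estimates: P(y = 1 | x) for each point.
risk = model.predict_proba(X)[:, 1]

# Classification is an extra decision layered on top: choose a threshold
# (0.5 is only a convention; it depends on the costs of each kind of error).
threshold = 0.5
labels = (risk >= threshold).astype(int)
print(risk[:3], labels[:3])
```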