Logistic Regression – Logistic vs Chi-Square in 2×2 and Ix2 Contingency Tables

chi-squared-testcontingency tableslogisticlogit

I'm trying to understand the use of logistic regression in 2×2 and Ix2 contingency tables. For instance, using this as an example

enter image description here

What is the difference between using chi-square test and using logistic regression? What about a table with multiple nominal factors (Ix2 table) like this:

enter image description here

There is a similar question here – but the answer is mainly that chi-square can handle mxn tables, but my question is what is specificalyl for when there is a binary outcome and a single nominal factor. (The linked thread also refers to this thread, but this is regarding mutiple variables/factors).

If it's just a single factor (i.e. no need to control for other variables) with a binary response, what is the purpose difference of doing logistic regression?

Best Answer

Ultimately, it's apples and oranges.

Logistic regression is a way to model a nominal variable as a probabilistic outcome of one or more other variables. Fitting a logistic-regression model might be followed up with testing whether the model coefficients are significantly different from 0, computing confidence intervals for the coefficients, or examining how well the model can predict new observations.

The χ² test of independence is a specific significance test that tests the null hypothesis that two nominal variables are independent.

Whether you should use logistic regression or a χ² test depends on the question you want to answer. For example, a χ² test could check whether it is unreasonable to believe that a person's registered political party is independent of their race, whereas logistic regression could compute the probability that a person with a given race, age, and gender belongs to each political party.

Related Question