Solved – Chi-square using factors with multiple levels in R

categorical-data, chi-squared-test, r

I'm not sure I have the right concept of how to perform the chi-square test. I have a variable called race, which is a factor with multiple levels for different races, and I would like to see whether there is a correlation between each race and the response variable (mortality). If I perform the chi-square test like this:

chi.race <- chisq.test(mortality, race)

then it gives me an output like:

Pearson's Chi-squared test
data:  mortality and race
X-squared = 4.9626, df = 9, p-value = 0.8376

but it doesn't tell me anything about the correlation between each level in race (e.g. White, Black, Asian, Hispanic, etc.) and mortality.

What am I doing wrong and/or what is wrong with my concept of chi-square? Thanks for the help.

Best Answer

The chi-square test is a hypothesis test for departures from independence in the counts in your table.

If that is what you want to test, you're probably not doing anything wrong.

You can produce a table of contributions to the chi-square statistic, or a table of Pearson residuals, either of which helps to identify which cells of the table contribute most to the departure from independence.
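As a minimal sketch of those two tables: the counts below are hypothetical, invented purely for illustration (they are not the asker's data), and the group labels A, B, C are placeholders. With the asker's variables, the table would come from something like table(race, mortality). The object returned by chisq.test() has a residuals component holding the Pearson residuals, and squaring them gives each cell's contribution to the statistic.

```r
# Hypothetical counts, invented for illustration only (not the asker's data)
tab <- matrix(c(30, 70,
                25, 75,
                40, 60),
              nrow = 3, byrow = TRUE,
              dimnames = list(race = c("A", "B", "C"),
                              mortality = c("dead", "alive")))

fit <- chisq.test(tab)

# Pearson residuals: (observed - expected) / sqrt(expected)
pearson <- fit$residuals
print(pearson)

# Contribution of each cell to the chi-square statistic:
# the squared Pearson residual
contrib <- pearson^2
print(contrib)

# The contributions add up to the test statistic
# (no continuity correction is applied for tables larger than 2 x 2)
print(all.equal(sum(contrib), unname(fit$statistic)))
```

Cells with large absolute residuals are the ones where the observed count departs most from what independence would predict; the sign shows the direction of the departure.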

However, it sounds like what you're actually interested in is a different question, perhaps something more like an effect-size estimate.

It sounds like mortality may be ordered. Is that right? Or is it just two categories (such as 'alive' and 'dead')?

In the second case, you could order the races by the proportion falling in one of the two categories and produce a display, including something like a standard error for each proportion, if you wanted to see which ones were most clearly different.
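For the two-category case, such a display might be built along the following lines. This is only a sketch under stated assumptions: the counts are hypothetical (not the asker's data), mortality is assumed to be binary, and the standard error used is the usual binomial one, sqrt(p(1 - p)/n), which is one reasonable choice rather than the only one.

```r
# Hypothetical counts, invented for illustration only (not the asker's data)
deaths <- c(White = 30, Black = 25, Asian = 40, Hispanic = 35)
n      <- c(White = 100, Black = 100, Asian = 100, Hispanic = 100)

p  <- deaths / n               # observed proportion dying in each group
se <- sqrt(p * (1 - p) / n)    # binomial standard error of each proportion

# Order the groups from lowest to highest proportion
ord <- order(p)
summary_tab <- data.frame(group      = names(p)[ord],
                          proportion = unname(p[ord]),
                          se         = unname(se[ord]),
                          row.names  = NULL)
print(summary_tab)
```

Groups whose proportions differ by much more than their standard errors are the ones that stand out most clearly; the same table could be passed to a plotting function such as dotchart() for a graphical version.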
