I am trying to determine whether there is a dependency between two survey questions about the effectiveness of subject-specific online communities. The first survey question asks how frequently a user posts questions or comments. The second survey question asks whether participating in the community increases the user's confidence in that subject area.
I think I need to use a Chi-Square Test for Independence. However, I am not sure how to calculate it, especially given the answer choices.
The possible answer choices for Question 1 are:
- Never
- 1-5 per month
- 6-10 per month
- 11+ per month
and for Question 2 are:
- Strongly disagree
- Disagree
- Neither agree nor disagree
- Agree
- Strongly agree
Is the Chi-Square Test for Independence the correct test, and if so, how do I apply it to the data? Does the data need to be grouped? For instance, should I collapse the answers for Question 1 into a binary choice of Never Post or Sometimes Post?
Thank you for your assistance and expertise.
====
Further questions.
- I can easily obtain the observed count from the survey data for each cell of the cross-tabulation (e.g. Strongly disagree and Never, Agree and 1 to 5 times, etc.). But where do I find the expected count?
- Am I comparing 1 to 5 times, 6 to 10 times, and 11 or more to Never (Never being the baseline or expected amount)?
Best Answer
I agree with @rolando2's suggestion that Spearman's/Kendall's might be better suited, since both questions have ordered answer choices. In general, doing this by hand is just inconvenient, but if you want to, have a look at this excellent Khan Academy clip, which shows exactly how to do a $\chi^2$ test.
I suggest you use some software; my current favorite is R, together with RStudio as your IDE.
First create your dataset, preferably in a spreadsheet that you then import, but you can also create the data in R:
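As a minimal sketch of creating the data directly in R, assuming you already have the count for every combination of answers: the numbers below are invented purely for illustration (they are not real survey results), and the object name `survey_tab` is just a placeholder.

```r
# Hypothetical counts, made up for illustration only (not real survey data).
# Rows: posting frequency (Question 1); columns: agreement (Question 2).
survey_tab <- matrix(
  c(12, 18, 30, 25, 15,
     8, 14, 28, 35, 20,
     5,  7, 15, 26, 18,
     4,  5, 10, 20, 25),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    posting = c("Never", "1-5 per month", "6-10 per month", "11+ per month"),
    confidence = c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")
  )
)
survey_tab <- as.table(survey_tab)

# Print the table with row, column, and grand totals as a sanity check
addmargins(survey_tab)
```

If the raw data has one row per respondent instead, `table(q1, q2)` on the two answer columns should give the same kind of object.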
Then run the $\chi^2$ test:
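A sketch of the test with base R's `chisq.test()`, again on invented counts. It also shows where the expected counts come from: for each cell, expected = (row total × column total) / grand total, which is what you would see if the two questions were independent. No answer category serves as a baseline.

```r
# Hypothetical counts, made up for illustration only (not real survey data).
survey_tab <- matrix(
  c(12, 18, 30, 25, 15,
     8, 14, 28, 35, 20,
     5,  7, 15, 26, 18,
     4,  5, 10, 20, 25),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    posting = c("Never", "1-5 per month", "6-10 per month", "11+ per month"),
    confidence = c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")
  )
)

# Pearson chi-square test of independence on the 4 x 5 table
chi_res <- chisq.test(survey_tab)
chi_res

# Expected counts under independence, as computed by chisq.test()
chi_res$expected

# The same expected counts by hand: row total * column total / grand total
expected_by_hand <- outer(rowSums(survey_tab), colSums(survey_tab)) / sum(survey_tab)
all.equal(unname(expected_by_hand), unname(chi_res$expected))
```

The degrees of freedom are (rows - 1) × (columns - 1), so 3 × 4 = 12 for the full table; no collapsing to a binary choice is needed as long as the expected counts are not too small.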
If you have cells with few outcomes (the usual rule of thumb is an expected count below about 5), you should use Fisher's exact test:
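A sketch with base R's `fisher.test()` on a deliberately sparse, invented table (`sparse_tab` is a placeholder name). For tables larger than 2 × 2 the exact computation can be slow or run out of workspace, so this example assumes a Monte Carlo p-value via `simulate.p.value = TRUE` is acceptable.

```r
# Hypothetical sparse counts, made up for illustration only.
sparse_tab <- matrix(
  c(3, 4, 6, 5, 2,
    2, 3, 7, 8, 4,
    1, 1, 3, 6, 5,
    0, 1, 2, 4, 6),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    posting = c("Never", "1-5 per month", "6-10 per month", "11+ per month"),
    confidence = c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")
  )
)

# Fix the seed so the simulated p-value is reproducible
set.seed(1)

# Fisher's test for an r x c table, with a Monte Carlo p-value (B replicates)
fisher_res <- fisher.test(sparse_tab, simulate.p.value = TRUE, B = 10000)
fisher_res$p.value
```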
For the Spearman method use:
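A sketch using base R's `cor.test()`, assuming the answer choices are coded as ordered integers (1 = lowest category). Because `cor.test()` needs one pair of values per respondent, the invented table of counts is first expanded back into individual rows.

```r
# Hypothetical counts, made up for illustration only (not real survey data).
survey_tab <- as.table(matrix(
  c(12, 18, 30, 25, 15,
     8, 14, 28, 35, 20,
     5,  7, 15, 26, 18,
     4,  5, 10, 20, 25),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    posting = c("Never", "1-5 per month", "6-10 per month", "11+ per month"),
    confidence = c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")
  )
))

# One row per cell (posting, confidence, Freq), then one row per respondent
long <- as.data.frame(survey_tab)
long <- long[rep(seq_len(nrow(long)), long$Freq), c("posting", "confidence")]

# Factor levels follow the dimnames order, so integer codes respect the ordering
posting_rank    <- as.integer(long$posting)
confidence_rank <- as.integer(long$confidence)

# Spearman rank correlation; exact = FALSE because the data are heavily tied
spearman_res <- cor.test(posting_rank, confidence_rank,
                         method = "spearman", exact = FALSE)
spearman_res
```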
And for the Kendall method use:
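The same idea as a sketch with `method = "kendall"`, again on invented counts expanded to one row per respondent. With this many ties, `cor.test()` reports the tie-adjusted tau (tau-b) and a normal-approximation p-value.

```r
# Hypothetical counts, made up for illustration only (not real survey data).
survey_tab <- as.table(matrix(
  c(12, 18, 30, 25, 15,
     8, 14, 28, 35, 20,
     5,  7, 15, 26, 18,
     4,  5, 10, 20, 25),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    posting = c("Never", "1-5 per month", "6-10 per month", "11+ per month"),
    confidence = c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")
  )
))

# Expand the table of counts to one row per respondent
long <- as.data.frame(survey_tab)
long <- long[rep(seq_len(nrow(long)), long$Freq), c("posting", "confidence")]

# Integer codes follow the category order given in dimnames
posting_rank    <- as.integer(long$posting)
confidence_rank <- as.integer(long$confidence)

# Kendall's tau; exact = FALSE because exact p-values are not available with ties
kendall_res <- cor.test(posting_rank, confidence_rank,
                        method = "kendall", exact = FALSE)
kendall_res
```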
Hope this helped