Chi-Square Test – How to Perform Chi-Square Test When Respondents May Be Assigned to Multiple Categories

chi-squared-test, contingency tables, hypothesis testing

I am referring to this example: http://www.sthda.com/english/wiki/chi-square-test-of-independence-in-r

The data used are frequencies for household tasks:

	Wife	Alternating	Husband	Jointly
Laundry	156	14	2	4
Main_meal	124	20	5	4
Dinner	77	11	7	13
Breakfast	82	36	15	7
Tidying	53	11	1	57
Dishes	32	24	4	53
Shopping	33	23	9	55
Official	12	46	23	15
Driving	10	51	75	3
Finances	13	13	21	66
Insurance	8	1	53	77
Repairs	0	3	160	2
Holidays	0	1	6	153

My questions:

Normally, in a contingency table for a chi-square test, the categories (rows) should be independent. In this data set, I am not sure whether the surveyed households were assigned to more than one row. In that case, would the chi-square test not be appropriate?
If each household had been asked about just one activity (row), would it then be OK to use the chi-square test as in the example?

Best Answer

To give some context, the table comes from p. 381 of "Nonsymmetric Correspondence Analysis: A Tool for Analysing Contingency Tables With a Dependence Structure" by Kroonenberg & Lombardo (1999), https://doi.org/10.1207/S15327906MBR3403_4. They adapted data collected by other researchers in Germany in the 1970s, and their article explains all of this in detail. The article is also freely available for download; the part about the household tasks study is pp. 379-384.

223 young, childless married couples were asked, for each task, who primarily performs that task in the household. So the same household is in principle counted in every row, which would indeed be problematic for conducting a chi-square test on this dataset (see "Is it okay to run a chi square if each participant is contributing multiple counts?"). In any case, as Peter Flom says in his answer, a chi-square test would be redundant in the first place, given the very obvious differences between rows.
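For reference, here is a minimal sketch of the Pearson statistic that the linked page computes on this table, written in plain Python rather than R so that each step is visible. The function name `pearson_chi_square` is just an illustrative helper, not something from the linked page. Because the same couples appear in every row, the independence assumption behind the test does not hold, so the resulting statistic should not be turned into a p-value.

```python
# Counts copied from the table in the question.
# Column order: Wife, Alternating, Husband, Jointly.
housetasks = {
    "Laundry":   [156, 14, 2, 4],
    "Main_meal": [124, 20, 5, 4],
    "Dinner":    [77, 11, 7, 13],
    "Breakfast": [82, 36, 15, 7],
    "Tidying":   [53, 11, 1, 57],
    "Dishes":    [32, 24, 4, 53],
    "Shopping":  [33, 23, 9, 55],
    "Official":  [12, 46, 23, 15],
    "Driving":   [10, 51, 75, 3],
    "Finances":  [13, 13, 21, 66],
    "Insurance": [8, 1, 53, 77],
    "Repairs":   [0, 3, 160, 2],
    "Holidays":  [0, 1, 6, 153],
}


def pearson_chi_square(rows):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = pearson_chi_square(list(housetasks.values()))
print(f"X-squared = {stat:.1f}, df = {df}")
```

The statistic is very large relative to its 36 degrees of freedom, which only restates what is already obvious from looking at the rows.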

You may have noticed that the row sums do not add up to 223 and differ from row to row. This is because responses were excluded from the table when the husband and the wife disagreed on who performs the task. In their article, Kroonenberg and Lombardo discuss why a "disagreement" column was not included in the table to take such cases into account.
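The row sums are easy to check directly. The sketch below recomputes them from the table in the question and compares them with the 223 couples; the "excluded" figure assumes, as described above, that every missing response is a disagreement between husband and wife.

```python
# Counts copied from the table in the question.
# Column order: Wife, Alternating, Husband, Jointly.
housetasks = {
    "Laundry":   [156, 14, 2, 4],
    "Main_meal": [124, 20, 5, 4],
    "Dinner":    [77, 11, 7, 13],
    "Breakfast": [82, 36, 15, 7],
    "Tidying":   [53, 11, 1, 57],
    "Dishes":    [32, 24, 4, 53],
    "Shopping":  [33, 23, 9, 55],
    "Official":  [12, 46, 23, 15],
    "Driving":   [10, 51, 75, 3],
    "Finances":  [13, 13, 21, 66],
    "Insurance": [8, 1, 53, 77],
    "Repairs":   [0, 3, 160, 2],
    "Holidays":  [0, 1, 6, 153],
}

N_COUPLES = 223  # number of couples surveyed, as stated in the answer

row_totals = {task: sum(counts) for task, counts in housetasks.items()}
grand_total = sum(row_totals.values())

for task, total in row_totals.items():
    # Assumption: all responses missing from a row are disagreements.
    excluded = N_COUPLES - total
    print(f"{task:<10} answered: {total:3d}  excluded: {excluded:3d}")

print(f"Total counts in table: {grand_total}")  # 1744
print(f"Couples surveyed:      {N_COUPLES}")
```

The table holds 1744 counts from only 223 couples, so each couple contributes to several rows, which is exactly the multiple-counts problem for the chi-square test.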

The core of their article discusses a possible method for analyzing this kind of data using a variant of correspondence analysis (nonsymmetric correspondence analysis), but a regression, as suggested by Peter Flom in his answer, may be a good option depending on the question you ultimately want to answer. Note that in their analysis, Kroonenberg and Lombardo treated "tasks" as the predictor variable and "who performed the task" as the response variable.

As a side note, it looks like the author of the webpage you link to in your question is unaware of the original source of the data and its study design, as they don't mention it. That probably explains why they thought a chi-square test was suitable for this table.
