Chi-Squared Test – Paired Comparison Chi Square

chi-squared-test mcnemar-test

I have a question about using the chi-square test for paired data. I have read that McNemar's test would be an option, but in some software, such as R, it works only for 2×2 tables (maybe I do not know the correct way to do it), and my table is larger than 2×2.

These are counts for individuals before and after applying an insecticide. Some people suggested a paired t test could work, but I am not too sure about it since my data are counts.

The data look like the table below.

\begin{array}{|l|r|r|}
&\text{Before}&\text{After}\\ \hline
\text{Species A}&30&12\\ \hline
\text{Species B}&30&7\\ \hline
\text{Species C}&30&6\\ \hline
\text{Species D}&30&2\\ \hline
\text{Species E}&30&4\\ \hline
\end{array}

Best Answer

I think you're looking at this the wrong way. You're trying to compare, across species, the proportion of insects left after applying insecticide. The 'before' counts aren't a random sample; they're part of the experimental setup. That is:

\begin{array}{l|c|c|c}
& &\text{count left }&\\
&n \text{ exposed} &\text{after insecticide}&\text{proportion left}\\ \hline
\text{Species A}&30&12&12/30\\ \hline
\text{Species B}&30&7&7/30\\ \hline
\text{Species C}&30&6&6/30\\ \hline
\text{Species D}&30&2&2/30\\ \hline
\text{Species E}&30&4&4/30\\ \hline
\end{array}

This is in effect a straight chi-square test, or you could use a binomial GLM.

To present as a chi-squared test you'd write two columns, the number remaining and the number dead (or missing or gone or whatever it is that happened), for each species and do a test of independence in the two-way table, which serves as a test of equality of proportion.

Edit - Like so:

\begin{array}{l|r|r|r}
&\text{Survived}&\text{Died}&n \text{ exposed}\\ \hline
\text{Species A}&12&18&30\\ \hline
\text{Species B}&7&23&30\\ \hline
\text{Species C}&6&24&30\\ \hline
\text{Species D}&2&28&30\\ \hline
\text{Species E}&4&26&30\\ \hline
\text{Total}&31&119&150
\end{array}
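
As a check, the Pearson statistic for this table can be worked out by hand. This is only a sketch of the standard calculation; the symbols O (observed count), E (expected count under equal proportions) and X^2 are notation introduced here, not taken from the original post. Every species has 30 exposed, so the expected counts are the same in each row, and the deviations in the "Died" column are just the negatives of those in the "Survived" column:

\begin{align}
E_{\text{survived}} &= \frac{30 \times 31}{150} = 6.2, \qquad E_{\text{died}} = \frac{30 \times 119}{150} = 23.8 \\
\sum_{\text{species}} \left(O_{\text{survived}} - 6.2\right)^2 &= 5.8^2 + 0.8^2 + 0.2^2 + 4.2^2 + 2.2^2 = 56.8 \\
X^2 &= \frac{56.8}{6.2} + \frac{56.8}{23.8} \approx 9.161 + 2.387 = 11.548
\end{align}

with (5 - 1)(2 - 1) = 4 degrees of freedom.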

Edit2: Here's a chi-squared test done in R; as you see it agrees with the values in Nick Cox's comment.

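 # Survivors per species (A-E); 30 individuals of each were exposed
 alive <- c(12, 7, 6, 2, 4)
 dead <- 30 - alive
 # Test of independence on the 5 x 2 table of alive/dead counts
 chisq.test(cbind(alive, dead))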

        Pearson's Chi-squared test

data:  cbind(alive, dead) 
X-squared = 11.5478, df = 4, p-value = 0.02105
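
For the binomial GLM alternative mentioned earlier, a minimal sketch might look like the following. The `species` factor and the object name `fit` are labels introduced here for illustration. The likelihood-ratio test from `anova()` is one reasonable way to test equality of proportions in this model, and its statistic will generally be close to, but not identical to, the Pearson chi-square.

```r
# Counts from the table above; 'species' is a label introduced for illustration
species <- factor(c("A", "B", "C", "D", "E"))
alive <- c(12, 7, 6, 2, 4)
dead <- 30 - alive

# Binomial GLM: survival probability as a function of species
fit <- glm(cbind(alive, dead) ~ species, family = binomial)

# Likelihood-ratio (deviance) test of a common survival proportion,
# on 4 degrees of freedom
anova(fit, test = "Chisq")
```

A possible advantage of the GLM route is that it extends naturally if covariates (dose, replicate, and so on) are added later.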

Edit 3: answering follow-up questions from comments:

"I would like to know if there is a test which allows me to make post-hoc comparisons between the species"

The issues are much the same as with ANOVA:

(i) If you have orthogonal contrasts: You can partition the chi-square into the orthogonal contrasts and test those. These contrasts are usually obvious a priori, and specified in advance.

(ii) If you want all pairwise comparisons (I assume you meant this option): You can do a series of 2-species comparisons with, if you wish, the typical sorts of adjustments for multiple testing (Bonferroni is trivial to do, for example, but conservative; you might use Keppel's modification of Bonferroni or a number of other options). Alternatively, you could look at multiple comparisons via simultaneous confidence intervals (Agresti et al., 2008, "Simultaneous confidence intervals for comparing binomial parameters", Biometrics 64, pp. 1270-1275).

Note that for some of the 2x2 comparisons the expected counts are low; e.g. for D vs E the expected counts in two cells are only 3. This is not as big a problem as it's made out to be (a variety of less conservative rules from the last four or five decades would say it's fine), but you can always either simulate the discrete distribution of the test statistic or compute an exact p-value by complete enumeration of the tail. Personally, for expected counts like those I wouldn't bother; they're absolutely fine.
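
The pairwise approach in (ii), and the simulated or exact alternatives for small expected counts, could be sketched as below. This uses base R's `pairwise.prop.test`, which by default applies a continuity correction to each 2-species test. Bonferroni is chosen only because it is the adjustment named above. Fisher's exact test stands in here for "an exact calculation", which is an assumption about which exact method is wanted. The object names `pw`, `de`, `ex` and `sim` are introduced for illustration.

```r
alive <- c(A = 12, B = 7, C = 6, D = 2, E = 4)
n <- rep(30, 5)

# All pairwise 2-species comparisons of proportions, Bonferroni-adjusted.
# R may warn that the chi-squared approximation is doubtful for some pairs.
pw <- pairwise.prop.test(alive, n, p.adjust.method = "bonferroni")
pw

# D vs E, where two expected counts are only 3
de <- cbind(alive = alive[c("D", "E")], dead = 30 - alive[c("D", "E")])

# Exact p-value (Fisher's exact test, conditional on the table margins)
ex <- fisher.test(de)
ex

# Simulated p-value for the Pearson statistic (Monte Carlo, fixed margins)
set.seed(1)
sim <- chisq.test(de, simulate.p.value = TRUE, B = 10000)
sim
```

Swapping `"bonferroni"` for `"holm"` gives a uniformly less conservative adjustment with the same familywise error control.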

(iii) If you're more interested in "which groups stand out" ('what made this significant?'), the usual approach would be to look at some form of standardized residual (such as a Pearson residual) or a contribution to chi-square. An alternative would be to collapse the tables to do 2x2 comparisons of each one against all the others.
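
For (iii), the residuals and per-cell contributions are available from the object returned by `chisq.test`. A minimal sketch, with `tab` and `res` as names introduced here:

```r
alive <- c(A = 12, B = 7, C = 6, D = 2, E = 4)
tab <- cbind(alive = alive, dead = 30 - alive)
res <- chisq.test(tab)

# Pearson residuals: (observed - expected) / sqrt(expected)
res$residuals

# Standardized residuals, which also account for the row and column margins
res$stdres

# Each cell's contribution to the chi-square statistic (these sum to X-squared)
res$residuals^2
```

Cells with large absolute residuals are the ones contributing most to the overall statistic.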
