Solved – Which test to use when comparing two frequencies

hypothesis testing, inference, r, statistical significance

I have a column A whose values have the following frequency distribution:

Value      Frequency
3          292
4          71
5          47
6          62
7          22
8          12
9          22

I have another column B, whose values have this frequency distribution:

Value      Frequency
3          274
4          71
5          46
6          62
7          22
8          12
9          22

Please note that the frequency of value 3 is 292 in column A and 274 in column B. My goal is to find out whether the frequency of value 3 in column A is statistically different from its frequency in column B.
I am guessing I cannot use a chi-square test? If so, what test should I use to check whether the frequencies of value 3 in columns A and B are similar or different?

Best Answer

If you are mainly interested in the proportion of outcomes taking Value 3, then it seems best to compare that proportion in A, which is $292/528 = 0.553,$ with that proportion in B, which is $274/509 = 0.538.$ The difference seems quite small.

A formal test (here done in Minitab) shows that this difference is not significant at the 5% level (P-value $0.634 > 0.05$). Also notice that the 95% confidence interval for the population difference covers $0$ (no difference).

Test and CI for Two Proportions 

Sample    X    N  Sample p
1       292  528  0.553030
2       274  509  0.538310

Difference = p (1) - p (2)
Estimate for difference:  0.0147199
95% CI for difference:  (-0.0458945, 0.0753343)
Test for difference = 0 (vs ≠ 0):  Z = 0.48  P-Value = 0.634

This test uses a normal approximation of the difference between two binomial proportions, which should be very accurate for your sample sizes above 500.
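For readers without Minitab, here is a minimal Python sketch of the same normal-approximation test, using only the standard library. The function name `two_proportion_z_test` is hypothetical. The choice of a pooled standard error for the test statistic and an unpooled one for the confidence interval is an assumption, since the Minitab options used are not shown above; for these data both give essentially the same Z.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2, z_crit=1.959964):
    """Two-sided z-test and 95% CI for a difference of two proportions.

    Hypothetical helper: pooled SE for the test and unpooled SE for the
    interval are assumptions, not confirmed Minitab settings.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Pooled proportion under H0: p1 == p2.
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

    # Unpooled standard error, used for the confidence interval.
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

    z = diff / se_pooled
    # Two-sided P-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)
    return diff, z, p_value, ci

# Counts of Value 3 and column totals from the question.
diff, z, p_value, ci = two_proportion_z_test(292, 528, 274, 509)
print(f"difference = {diff:.7f}")
print(f"Z = {z:.2f}, P-value = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.7f}, {ci[1]:.7f})")
```

Run on the counts from the question, this should agree with the Minitab output above to the digits shown: a difference of about 0.0147, Z of about 0.48, and a P-value of about 0.634.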

Notes: [a] You could also do a chi-squared test of the null hypothesis that the proportions of outcomes with Values 3 through 9 are 'homogeneous' for A and B. (Computations are the same as for a test of 'independence' between Values (3 through 9) and Types (A and B).) That test also does not give a significant result.

Pearson Chi-Square = 0.235, DF = 6, P-Value = 1.000
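The chi-squared statistic in note [a] can be checked by hand from the two frequency tables. The sketch below is one way to do that in plain Python; the function names are hypothetical. The P-value helper relies on the closed form for the chi-squared upper tail when the degrees of freedom are even, $P(X > x) = e^{-x/2} \sum_{i=0}^{k-1} (x/2)^i / i!$ with $k = \mathrm{DF}/2$, so it only applies to even DF such as the 6 here.

```python
import math

def chi_square_homogeneity(rows):
    """Pearson chi-squared statistic and DF for a table of counts.

    `rows` holds one list of counts per group (here: columns A and B).
    """
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)

    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (observed - expected) ** 2 / expected

    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here needs a positive even df")
    half = x / 2
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term          # adds (x/2)^i / i!
        term *= half / (i + 1)
    return math.exp(-half) * total

# Frequencies of Values 3 through 9 from the question.
counts_a = [292, 71, 47, 62, 22, 12, 22]
counts_b = [274, 71, 46, 62, 22, 12, 22]

stat, df = chi_square_homogeneity([counts_a, counts_b])
p_chi2 = chi2_upper_tail_even_df(stat, df)
print(f"Pearson Chi-Square = {stat:.3f}, DF = {df}, P-Value = {p_chi2:.3f}")
```

With the counts above this should reproduce the reported statistic of about 0.235 on 6 degrees of freedom, with a P-value that rounds to 1.000.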

[b] I do not see how it would be appropriate to use a t test to answer this question.
