In Stata, this can be done with the immediate version of prtest, prtesti. The immediate version obtains its data from arguments you type (either group sizes and proportions, or group sizes and success counts) rather than from the dataset in memory. Here are the two ways of doing this:
. prtesti 10 .7 10 .8

Two-sample test of proportions                     x: Number of obs =       10
                                                   y: Number of obs =       10
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |         .7   .1449138                      .4159742    .9840258
           y |         .8   .1264911                       .552082    1.047918
-------------+----------------------------------------------------------------
        diff |        -.1   .1923538                     -.4770066    .2770066
             |  under Ho:   .1936492    -0.52   0.606
------------------------------------------------------------------------------
        diff = prop(x) - prop(y)                                  z =  -0.5164
    Ho: diff = 0

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.3028         Pr(|Z| > |z|) = 0.6056          Pr(Z > z) = 0.6972
. prtesti 10 7 10 8, count

Two-sample test of proportions                     x: Number of obs =       10
                                                   y: Number of obs =       10
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |         .7   .1449138                      .4159742    .9840258
           y |         .8   .1264911                       .552082    1.047918
-------------+----------------------------------------------------------------
        diff |        -.1   .1923538                     -.4770066    .2770066
             |  under Ho:   .1936492    -0.52   0.606
------------------------------------------------------------------------------
        diff = prop(x) - prop(y)                                  z =  -0.5164
    Ho: diff = 0

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.3028         Pr(|Z| > |z|) = 0.6056          Pr(Z > z) = 0.6972
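If you want to check these numbers outside Stata, the pooled two-sample z-test is easy to reproduce by hand. Here is a sketch in Python (scipy assumed), using the same group sizes and proportions as the prtesti calls:

```python
from math import sqrt
from scipy.stats import norm

# Group sizes and sample proportions, as passed to prtesti
n1, p1 = 10, 0.7
n2, p2 = 10, 0.8

# Pooled proportion and standard error under Ho: p1 = p2
p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)
se_ho = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se_ho                 # test statistic
p_two_sided = 2 * norm.cdf(-abs(z))   # Pr(|Z| > |z|)
```

The z statistic (-0.5164) and the two-sided p-value (0.6056) agree with the Stata output above.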
In this case, you cannot reject the null hypothesis that the proportions are the same in the two groups. One caveat is that this is a large-sample test, but your sample sizes are very small. Given these proportions, you would need almost 600 observations (compared with your 20) to detect the difference with 80% power, for a two-sided alternative at the 5% level and equal group sizes:
. power twoproportions 0.7 0.8, nratio(1) alpha(.05) test(chi2)

Performing iteration ...

Estimated sample sizes for a two-sample proportions test
Pearson's chi-squared test
Ho: p2 = p1  versus  Ha: p2 != p1

Study parameters:

        alpha =    0.0500
        power =    0.8000
        delta =    0.1000  (difference)
           p1 =    0.7000
           p2 =    0.8000

Estimated sample sizes:

            N =       588
  N per group =       294
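As a cross-check, the textbook normal-approximation sample-size formula for two proportions happens to land on the same answer here (a Python sketch; this is the standard formula, not necessarily Stata's internal algorithm):

```python
from math import ceil, sqrt
from scipy.stats import norm

p1, p2 = 0.7, 0.8
alpha, power = 0.05, 0.80

z_a = norm.ppf(1 - alpha / 2)   # two-sided critical value
z_b = norm.ppf(power)

# Pooled proportion under Ho (equal group sizes assumed)
p_bar = (p1 + p2) / 2
num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
       + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
n_per_group = ceil(num / (p1 - p2) ** 2)
```

With these inputs the formula gives about 293.15, which rounds up to 294 per group, matching Stata.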
One alternative would be to use Fisher's exact test, whose null hypothesis is that the row variable does not affect the column outcomes, i.e. that the two are independent. Here's how you can do that in Stata, though you have to specify the count in each cell:
. tabi 7 3 \ 8 2, exact

           |          col
       row |         1          2 |     Total
-----------+----------------------+----------
         1 |         7          3 |        10
         2 |         8          2 |        10
-----------+----------------------+----------
     Total |        15          5 |        20

           Fisher's exact =                 1.000
   1-sided Fisher's exact =                 0.500
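The same exact test is available outside Stata as well, for example in Python's scipy, with the cell counts from the table above:

```python
from scipy.stats import fisher_exact

# Rows are the two groups, columns are successes and failures
table = [[7, 3], [8, 2]]

odds_ratio, p_two_sided = fisher_exact(table)              # two-sided by default
_, p_one_sided = fisher_exact(table, alternative="less")   # one-sided alternative
```

Both p-values (1.000 two-sided, 0.500 one-sided) agree with the Stata output.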
The p-values are very large for both the one-sided and two-sided tests, so we still cannot reject the null that the proportions are the same.
First, let's see if there are differences in the proportion working across
the four groups A, B, C, D. (Data similar to yours.)
w  = c(90, 32, 9, 3)   # number working in groups A, B, C, D
nw = c(46,  7, 8, 5)   # number not working
TBL = rbind(w, nw)
chisq.test(TBL)
Pearson's Chi-squared test
data: TBL
X-squared = 8.7062, df = 3, p-value = 0.03346
Warning message:
In chisq.test(TBL) :
Chi-squared approximation may be incorrect
The low cell counts in groups C and D trigger a warning message, putting the validity of the p-value in doubt. The version of 'chisq.test' implemented in R allows simulating a more accurate p-value, which still shows a significant effect at the 5% level.
chisq.test(TBL, sim=T)$p.val
[1] 0.03098451
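For completeness, here is a Python counterpart (a sketch: scipy's chi2_contingency mirrors chisq.test, and the label-shuffling loop below is one way to mimic sim=T, since permuting the working/not-working labels across individuals keeps both margins fixed; the seed and number of shuffles are arbitrary choices):

```python
import numpy as np
from scipy.stats import chi2_contingency

w  = [90, 32, 9, 3]   # number working in groups A, B, C, D
nw = [46, 7,  8, 5]   # number not working
tbl = np.array([w, nw])

# Asymptotic test; no Yates correction is applied for tables larger than 2x2
chi2, p, dof, expected = chi2_contingency(tbl)

# Monte Carlo p-value: shuffle the working/not-working labels across the
# 200 individuals and recompute the statistic; margins stay fixed, so the
# expected counts are the same in every shuffle
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(4), tbl.sum(axis=0))   # each person's group
status = np.repeat([0, 1], tbl.sum(axis=1))         # 0 = working, 1 = not
exp = np.outer(tbl.sum(axis=1), tbl.sum(axis=0)) / tbl.sum()
reps, hits = 10_000, 0
for _ in range(reps):
    rng.shuffle(status)
    sim = np.zeros((2, 4))
    np.add.at(sim, (status, groups), 1)             # rebuild the 2x4 table
    hits += ((sim - exp) ** 2 / exp).sum() >= chi2
p_sim = (hits + 1) / (reps + 1)
```

Up to Monte Carlo noise, p_sim should land close to R's simulated value of about 0.031.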
Significance barely at the 5% level does not invite extensive ad hoc tests; to avoid false discovery, such tests would need to show significance at lower levels. Furthermore, it is not clear just which confidence intervals would be of interest. A look at the Pearson residuals, to see if there are groups that are strikingly different, possibly suggests comparing groups A and B. However, the level of significance there is unimpressive, especially if we protect against false discovery.
chisq.test(TBL)$resi
[,1] [,2] [,3] [,4]
w -0.1173306 1.148334 -0.7081676 -1.019365
nw 0.1671828 -1.636247 1.0090588 1.452480
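The residuals are just (observed - expected)/sqrt(expected); a Python sketch using scipy's expected counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

tbl = np.array([[90, 32, 9, 3],    # working, groups A..D
                [46, 7,  8, 5]])   # not working

# chi2_contingency returns the table of expected counts as its 4th element
_, _, _, expected = chi2_contingency(tbl)
pearson_resid = (tbl - expected) / np.sqrt(expected)
```

The entries match R's output above, e.g. 1.148 for working in group B and -1.636 for not working in group B.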
chisq.test(TBL[,c(1,2)], cor=F)
Pearson's Chi-squared test
data: TBL[, c(1, 2)]
X-squared = 3.6176, df = 1, p-value = 0.05717
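The same A-versus-B subtable test in Python (correction=False corresponds to cor=F, i.e. no Yates continuity correction):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Columns A and B of the full table: working on top, not working below
sub = np.array([[90, 32],
                [46, 7]])
chi2, p, dof, _ = chi2_contingency(sub, correction=False)
```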
You have already said you know how to use 'prop.test' to get a 95% confidence interval for the difference of proportions in A and B. I don't see a point in looking at other pairs of groups, especially in view of the low counts there. Maybe you would like to compare group A with the other three groups combined; 'prop.test' can handle that too.
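For reference, a plain Wald interval for the A-versus-B difference can be sketched in Python as follows. (Note that prop.test's default interval also applies a continuity correction, so its bounds will differ slightly, and this unpooled interval can disagree with the pooled chi-squared test near the 5% boundary, as it does here.)

```python
from math import sqrt
from scipy.stats import norm

# Group A: 90 of 136 working; group B: 32 of 39 (counts from the table above)
x1, n1 = 90, 136
x2, n2 = 32, 39

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
# Unpooled (Wald) standard error of the difference
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = norm.ppf(0.975)
ci = (diff - z * se, diff + z * se)
```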
If you had additional kinds of analyses in mind using confidence intervals, please be more specific, and maybe one of us can help.
Best Answer
Note that the first article you mentioned is about two "independent" populations. In polling, only a "single" population is sampled, so the sample proportions for candidates A and B are related (a respondent counted for A is not counted for B, which makes the two estimates negatively correlated). In two-independent-population tests, as suggested in the first article, the two samples, and hence the sample means, are assumed to be fully independent. That is probably why comparing the CIs for the yes/no question in a single population (as in presidential voting) works fine.