Solved – Large overlap between confidence intervals, although z test for difference was significant

confidence interval; proportion; z-test

I have calculated a z test to examine the difference between two proportions. I found differences to be significant. The formula I used was:

$$z = (p_1-p_2)/SE$$
where
$$SE = \sqrt{ p ( 1 - p ) \left( \frac{1}{n_1} + \frac{1}{n_2} \right) }$$

and $n_1$ is the size of sample 1 and $n_2$ is the size of sample 2.
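The test statistic above can be sketched in Python as follows. This is a minimal illustration, not the asker's actual computation: it assumes that $p$ denotes the pooled proportion $(x_1 + x_2)/(n_1 + n_2)$, and the counts (60 of 100 versus 45 of 100) are made up for the example.

```python
import math

def pooled_z_test(x1, n1, x2, n2):
    """Two-proportion z test with a pooled standard error.

    x1, x2 are success counts; n1, n2 are sample sizes.
    Assumes p in the SE formula is the pooled proportion.
    """
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion (assumption)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, chosen only for illustration
z, p_value = pooled_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```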

I have also plotted a bar graph of my proportions with 95% confidence intervals around each of the two proportions based on this equation:

For proportion 1:
$p_1 \pm 1.96SE$

$SE = \sqrt{p_1(1-p_1)/ n_1}$; $n_1$: size of sample 1

For proportion 2:

$p_2 \pm 1.96SE$

$SE = \sqrt{p_2(1-p_2)/ n_2}$; $n_2$: size of sample 2

Given the significant z test, I would not expect the CIs to overlap much. However, they overlap by more than 25%. Am I using the correct formula for the confidence intervals? If so, why does there seem to be a discrepancy between the result of the z test and the visual depiction of the difference?

Best Answer

Statistics is a large field, and it pays to understand the fundamentals before applying it. What you describe is a common misconception. An individual confidence interval makes an inference about one quantity of interest; to compare two quantities you need a confidence interval for their difference. Many confidence intervals have been developed for the difference between two probabilities. The individual intervals can easily overlap while the interval for the difference excludes zero. The converse does not hold, however: if the interval for the difference includes zero, the two individual intervals must overlap. Note also that the two individual intervals are typically not of much interest in a study in which the subjects are not a simple random sample from the population about which inference is desired.
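A small numeric sketch may make this concrete. It uses simple Wald intervals and made-up counts (60 of 100 versus 45 of 100); both are assumptions for illustration, not the asker's data. The standard error of the difference is $\sqrt{SE_1^2 + SE_2^2}$, which is smaller than $SE_1 + SE_2$, so the interval for the difference can exclude zero even when the two individual intervals overlap.

```python
import math

def wald_ci(estimate, se, z=1.96):
    """Return the (lower, upper) Wald interval estimate +/- z * se."""
    return estimate - z * se, estimate + z * se

# Hypothetical counts, chosen only for illustration
x1, n1 = 60, 100
x2, n2 = 45, 100
p1, p2 = x1 / n1, x2 / n2

# Individual (unpooled) standard errors and 95% intervals
se1 = math.sqrt(p1 * (1 - p1) / n1)
se2 = math.sqrt(p2 * (1 - p2) / n2)
ci1 = wald_ci(p1, se1)
ci2 = wald_ci(p2, se2)

# Interval for the difference: the variances add, the standard errors do not
se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
ci_diff = wald_ci(p1 - p2, se_diff)

intervals_overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]
diff_excludes_zero = ci_diff[0] > 0 or ci_diff[1] < 0

print("CI for p1:", ci1)
print("CI for p2:", ci2)
print("CI for p1 - p2:", ci_diff)
print("individual intervals overlap:", intervals_overlap)
print("difference interval excludes zero:", diff_excludes_zero)
```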

Related Solutions

Solved – reference that legitimises the use of the unpooled z-test to compare two proportions

There is quite a bit of discussion about this on the AP site.

You can use whatever statistic you want, provided that you are clear about what you do and look at the appropriate null distribution to calculate p-values or thresholds.

But some statistics are better than others; in this case you'd be looking for (a) null distribution easily calculated and (b) power to detect difference.

But I don't know why you'd favor the unpooled variance over the pooled variance for the test, though it could be preferred in calculating a confidence interval for the difference.
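To see how the two choices differ in practice, here is a rough Python sketch that computes the z statistic with both the pooled and the unpooled standard error. The function name and the counts are hypothetical, chosen only for illustration.

```python
import math

def z_pooled_and_unpooled(x1, n1, x2, n2):
    """Return (z_pooled, z_unpooled) for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled SE: uses one common proportion, as implied by H0: p1 == p2
    p = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    # Unpooled SE: each sample contributes its own variance estimate
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff / se_pooled, diff / se_unpooled

# Hypothetical counts, chosen only for illustration
z_pooled, z_unpooled = z_pooled_and_unpooled(60, 100, 45, 100)
print(f"pooled z = {z_pooled:.3f}, unpooled z = {z_unpooled:.3f}")
```

With these particular counts the two statistics are close; the gap can be larger when the sample sizes or the proportions are very different.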

Solved – How to test for equality of means for proportions? And two-sided or one-sided

Here's what R's prop.test (Test of Equal or Given Proportions) says:

> p = c (0.559, 0.555)
> n <- c(16753, 5378)
> n*p
[1] 9364.927 2984.790
> prop.test (round (n*p), n)

    2-sample test for equality of proportions with continuity correction

data:  round(n * p) out of n 
X-squared = 0.2437, df = 1, p-value = 0.6215
alternative hypothesis: two.sided 
95 percent confidence interval:
 -0.01141974  0.01935036 
sample estimates:
   prop 1    prop 2 
0.5590044 0.5550390

So far, I cannot see any reason in your question why $p_{Single}$ cannot be larger than $p_{Married}$.
Evidently, you also did not specify the alternative hypothesis $p_{Married} > p_{Single}$ beforehand (otherwise there would be no discussion about it now).
So unless you or your professor can give a hard reason why $p_{Married} \leq p_{Single}$ is impossible (like a law that Singles are allowed to enter the group only if they each bring at least one Married who is not yet a member), the test should be two-sided.

Choosing the "larger" alternative afterwards, just because the observed $p_{Married}$ happens to be a bit larger than the observed $p_{Single}$ at the third digit, is cherry-picking.
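The effect of that choice on the p-value can be sketched in Python using the rounded counts from the R session above (9365 of 16753 and 2985 of 5378). This sketch uses a plain pooled z test without the continuity correction, so its two-sided p-value will differ slightly from the prop.test output; which group is Married and which is Single is not stated in the excerpt, so the groups are simply called 1 and 2.

```python
import math

# Rounded counts from the R session above: round(n * p) and n
x1, n1 = 9365, 16753
x2, n2 = 2985, 5378

p1, p2 = x1 / n1, x2 / n2
p = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se  # no continuity correction (unlike prop.test's default)

# Two-sided: evidence against p1 == p2 in either direction
two_sided = math.erfc(abs(z) / math.sqrt(2))
# One-sided: evidence for p1 > p2 only; must be chosen before seeing the data
one_sided = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.3f}")
print(f"two-sided p = {two_sided:.4f}, one-sided p = {one_sided:.4f}")
```

Because the observed difference happens to fall in the direction of the one-sided alternative, the one-sided p-value is exactly half the two-sided one, which is why picking the direction after looking at the data overstates the evidence.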
