Whenever I have doubts about the performance of a particular method, I run a simulation study to examine how well the method works under similar conditions. Below is a simple example in R for the case you describe. Note that I set the true proportions of the two groups equal, to a value somewhere in between what you actually observed in the two samples, so the simulation yields the empirical Type I error rate of the test, which should be close to .05. Setting the number of iterations large enough keeps the simulation error small. Also, I run the test once without and once with Yates' continuity correction to see whether the correction matters here.
iters <- 100000   # number of simulation iterations
n <- 23000        # sample size per group
p <- 0.0027       # true proportion, equal in both groups (so the null is true)
x1i <- rbinom(iters, n, p)   # simulated counts, group 1
x2i <- rbinom(iters, n, p)   # simulated counts, group 2
pval1 <- rep(NA, iters)      # p-values without continuity correction
pval2 <- rep(NA, iters)      # p-values with Yates' continuity correction
for (i in 1:iters) {
   tab <- matrix(c(x1i[i], n - x1i[i], x2i[i], n - x2i[i]), nrow = 2, byrow = TRUE)
   pval1[i] <- chisq.test(tab, correct = FALSE)$p.value
   pval2[i] <- chisq.test(tab, correct = TRUE)$p.value
}
round(mean(pval1 <= .05), 3)  # empirical Type I error rate without correction
round(mean(pval2 <= .05), 3)  # empirical Type I error rate with correction
Here are the results from one run:
> round(mean(pval1 <= .05), 3)
[1] 0.05
> round(mean(pval2 <= .05), 3)
[1] 0.04
So, the test maintains the nominal Type I error rate when Yates' continuity correction is not used. With the correction, the test is slightly conservative.
If you want to find out about the power of the test, you can set the true proportions to two different values and then rerun the simulation.
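For instance, a power run only requires changing the two true proportions. The values of p1 and p2 below are illustrative assumptions, not taken from your data:

```r
# Empirical power: same simulation as above, but with unequal true proportions.
# p1 and p2 are illustrative values chosen for demonstration.
iters <- 10000
n <- 23000
p1 <- 0.0020
p2 <- 0.0035
x1i <- rbinom(iters, n, p1)
x2i <- rbinom(iters, n, p2)
pval <- numeric(iters)
for (i in 1:iters) {
   tab <- matrix(c(x1i[i], n - x1i[i], x2i[i], n - x2i[i]), nrow = 2, byrow = TRUE)
   pval[i] <- chisq.test(tab, correct = FALSE)$p.value
}
mean(pval <= .05)  # proportion of rejections = empirical power
```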
Best Answer
The standard formula for testing equality of 2 proportions (using the normal approximation) uses a pooled estimate of the proportion that is appropriate when the null of equal proportions is true. In your case the proportions are not equal, so the pooled proportion is not appropriate.
One option is to code the formula that does not pool the proportions yourself and compute the p-value from the normal approximation.
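A minimal sketch of that unpooled z-test for H0: p2 - p1 = C follows; the counts and the value of C are hypothetical placeholders, not values from the question:

```r
# Unpooled z-test for H0: p2 - p1 = C (illustrative numbers).
x1 <- 60; n1 <- 23000   # hypothetical count and sample size, group 1
x2 <- 90; n2 <- 23000   # hypothetical count and sample size, group 2
C  <- 0.001             # hypothesized difference under the null

p1hat <- x1 / n1
p2hat <- x2 / n2
# Unpooled standard error: each proportion estimated separately.
se <- sqrt(p1hat * (1 - p1hat) / n1 + p2hat * (1 - p2hat) / n2)
z  <- (p2hat - p1hat - C) / se
pval <- 2 * pnorm(-abs(z))  # two-sided p-value from the normal approximation
```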
Another option is to just use prop.test, but ignore the p-value and instead check whether the confidence interval includes the value C you are interested in. If C is not in the interval, that is equivalent to rejecting the null; if C is in the interval, that is equivalent to a p-value greater than alpha (not enough evidence to reject). You don't get an exact p-value, but you reach the same decision.
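The confidence-interval check might look like this; the counts and C are again hypothetical placeholders:

```r
# Decision via the confidence interval from prop.test (illustrative numbers).
x <- c(60, 90)          # hypothetical successes in the two groups
n <- c(23000, 23000)    # hypothetical sample sizes
C <- 0.001              # hypothesized difference p1 - p2 under the null

ci <- prop.test(x, n)$conf.int     # confidence interval for p1 - p2
reject <- C < ci[1] || C > ci[2]   # reject H0: p1 - p2 = C if C lies outside
```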