Solved – Testing for differences in very small proportions

This is a fairly typical A/B test setup. However, the proportions I'd like to compare are very small (< 1%). Consequently, a 20% relative difference, which would be meaningful, is only about 0.001 in absolute terms. That's a small effect size!

A power calculation for the chi-squared test for a difference in proportions (equal sample sizes), using the pwr package in R, yields

difference of proportion power calculation for binomial distribution 
  (arcsine transformation) 

          h = 0.001
         n1 = 1e+05
         n2 = 1e+05
  sig.level = 0.05
      power = 0.05574725
alternative = two.sided

So power grows very slowly. How can I get around this?

As expected, the group a proportion is lower than the group b proportion (group b contains the treatment we're interested in). Say group b's proportion is 0.007.

Can I use an exact binomial test on group a, with the alternative that its proportion is greater than group b's proportion?

When I do this I get a p-value of 0.97, which I read as indicating that group a is very unlikely to increase in conversion enough to overtake group b's current position.

I guess the question becomes one of power with this test? Any comments on the validity of this?

Best Answer

I find that it helps to think in events rather than proportions to get the general scale needed, and then move on to a more precise power calculation. For rare events, the sampling error of a count is roughly the square root of the number of events. So if your group b has a proportion of 0.007, that's 700 expected events in a sample of 100,000 cases, with a sampling error of around 26 events (the square root of 700). A proportion of 0.008 in a sample of 100,000 cases has 800 expected events, a gap of 100 events that is well above the sampling error of either count. So it seems that you shouldn't be as far from adequate power as your output from the pwr package suggests.
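The back-of-the-envelope reasoning above can be sketched in a few lines. This is only a rough illustration in Python (standard library only), assuming a Poisson-style approximation in which the standard error of a rare-event count is the square root of the count. The variable names are made up for this sketch, and 0.007 and 0.008 are the illustrative proportions from the text, not measured values.

```python
import math

# Illustrative inputs taken from the discussion above (assumptions, not data).
n_per_group = 100_000
p_low, p_high = 0.007, 0.008

# Expected number of events in each group.
events_low = n_per_group * p_low
events_high = n_per_group * p_high

# Rare-event (Poisson-style) approximation: SE of a count ~ sqrt(count).
se_low = math.sqrt(events_low)
se_high = math.sqrt(events_high)

# SE of the difference between two independent counts.
se_diff = math.sqrt(events_low + events_high)

# How many standard errors wide is the gap between the two groups?
z_gap = (events_high - events_low) / se_diff

print(f"expected events: {events_low:.0f} vs {events_high:.0f}")
print(f"SE of each count: {se_low:.1f} and {se_high:.1f}")
print(f"gap in SE units: {z_gap:.2f}")
```

A gap of roughly two and a half standard errors is not overwhelming, but it is far from the near-zero power reported in the question, which hints that something in the original inputs is off.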

Double-check that the input to the function you called in the pwr package is correct. I don't use that package, but it seems to have a specific definition of "effect size", computed by its ES.h() function. Using that formula for proportions of 0.007 and 0.008 gives me an "effect size" of 0.011, not the simple proportion difference of 0.001 that you seem to have specified when calling the function.
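To make the difference between the two inputs concrete, here is a minimal sketch in Python (standard library only) rather than R. It assumes the effect size in question is the arcsine-transformed difference h = 2·asin(√p1) − 2·asin(√p2), and that power follows the usual normal approximation for two equal-sized groups; I believe this mirrors what pwr does, but treat that as an assumption. The function names `cohen_h` and `power_two_proportions` are invented for this example.

```python
import math
from statistics import NormalDist

def cohen_h(p1: float, p2: float) -> float:
    """Arcsine-transformed difference between two proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

def power_two_proportions(h: float, n: int, sig_level: float = 0.05) -> float:
    """Two-sided power for two groups of n cases each (normal approximation)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - sig_level / 2)
    shift = abs(h) * math.sqrt(n / 2)
    # Probability of landing in either rejection region under the alternative.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

n = 100_000

# Passing the raw difference 0.001 as h, as in the question's output.
power_raw = power_two_proportions(0.001, n)

# Passing the transformed effect size for 0.008 vs 0.007 instead.
h = cohen_h(0.008, 0.007)
power_h = power_two_proportions(h, n)

print(f"h for 0.008 vs 0.007: {h:.4f}")
print(f"power with h = 0.001: {power_raw:.4f}")
print(f"power with transformed h: {power_h:.4f}")
```

With h = 0.001 this reproduces the power of about 0.056 shown in the question, while the transformed effect size (close to 0.0116) gives power of roughly 0.74 at the same sample size.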

You can't get away from the need for large numbers of cases with low proportions, but things might not be quite so bad for your present application as you fear.
