Solved – Hypothesis testing on Poisson Binomial distribution

Tags: hypothesis-testing, multiple-comparisons, poisson-binomial-distribution, statistical-significance

Imagine $N$ baskets each filled with a different number of products chosen from a set $P$. It is possible to have repeated products in the same basket.

For instance:

basket 1 = [A,B,B,C]
basket 2 = [A,B,C,D,F,J]
basket 3 = [C]
...
basket N = [E,E,E,F,F]

Now we ask someone to choose one item from each basket. For instance:

basket 1 = [A,B,B,C] - A
basket 2 = [A,B,C,D,F,J] - A
basket 3 = [C] - C
...
basket N = [E,E,E,F,F] - E

How can we test this person's preference for a given product?

EDIT:

The problem can be simplified into this one:

We have $N$ independent trials, where trial $i$ follows a Bernoulli distribution with success probability $p_i$; the total number of successes then follows a Poisson binomial distribution.

Imagine that we observed $k$ successful trials.

Is there a way to compute the deviation of the observation with respect to the distribution, and to quantify how significant that deviation is?

(I'm thinking of p-values and z-scores in a normal distribution)

Best Answer

You can just use the number of successes itself as the test statistic.

If you want a one-tailed test, this is simple (and I expect you will). For a two-tailed test, because of the asymmetry, the usual approach would be to allocate $\alpha/2$ to each tail and compute a rejection region that way. There's some variation in exactly how this gets implemented, since you can't get exactly $\alpha/2$ in either tail, and p-values are slightly more complicated (you base them on whatever rule you adopt for how the rejection region is computed).
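For concreteness, here's a minimal sketch in R of one way to do that allocation, assuming you already have the pmf of the number of successes on support $0,\dots,N$ (computed by any of the methods below); the helper name is made up and the exact conventions are up to you:

    # Hypothetical helper: given a pmf on 0..n, find a two-tailed rejection
    # region by putting at most alpha/2 in each tail.
    two_tail_region <- function(pmf, alpha = 0.05) {
      supp <- seq_along(pmf) - 1              # support 0..n
      cdf  <- cumsum(pmf)                     # P(X <= x)
      sf   <- rev(cumsum(rev(pmf)))           # P(X >= x)
      lo <- if (any(cdf <= alpha / 2)) max(supp[cdf <= alpha / 2]) else -Inf
      hi <- if (any(sf  <= alpha / 2)) min(supp[sf  <= alpha / 2]) else  Inf
      c(reject_if_at_most = lo, reject_if_at_least = hi)
    }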

1) If the $p_i$s are known and all are small, you can use a Poisson approximation for the number of successes.

If the $p_i$ are all pretty small, the mean of the sum is close to the variance of the sum, so one quick assessment of whether they're small enough is to compare the variance to the mean; if the ratio is close to 1, the approximation will normally work fairly well. What constitutes 'close' depends on your criteria; you really need to calibrate it yourself. As a very rough rule of thumb I'd suggest variance/mean above 0.9, but you may want it higher. If you want a more precise rule about when the Poisson will work, Le Cam's theorem bounds the sum of absolute deviations between the true probability function and the Poisson approximation (though this doesn't necessarily tell you how well it performs in the part of the tail that determines the accuracy of a nominal significance level).
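As a sketch in R (using the $p_i$ from the worked example at the end of this answer; substitute your own `p` and observed count `k`), the one-tailed p-value under this approximation is just an upper Poisson tail with mean $\sum_i p_i$:

    p <- c(rep(0.05, 5), rep(0.10, 4))        # the p_i's from the example below
    k <- 3                                    # observed number of successes
    sum(p * (1 - p)) / sum(p)                 # variance/mean: 0.919, borderline
    ppois(k - 1, sum(p), lower.tail = FALSE)  # approx. P(X >= k): about 0.028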

2) If the $p_i$ are not necessarily small but there are a lot of them, you can use a normal approximation (perhaps with continuity correction). Assuming the $p_i$ are typically less than 0.5, a simple rule would be: if the coefficient of variation of the number of successes is sufficiently* small, the normal approximation should be fine.

* What constitutes 'sufficiently' depends on your criteria; again, you really need to calibrate it yourself. But as a very rough rule of thumb, I'd suggest that the reciprocal of the coefficient of variation be more than 4 if you want roughly accurate p-values around 0.05 (one-tailed), more than 5 if you want roughly accurate p-values around the 2.5% level, and more than 7 if you want reasonable accuracy around the 1% level. Those are pretty rough, though; you may well want more accuracy than that.
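A corresponding sketch, with `p` and `k` as above:

    mu    <- sum(p)                           # mean of the number of successes
    sigma <- sqrt(sum(p * (1 - p)))           # its standard deviation
    mu / sigma                                # reciprocal of the CV: 0.84, far below 4
    pnorm((k - 0.5 - mu) / sigma,             # continuity-corrected P(X >= k):
          lower.tail = FALSE)                 # about 0.008, badly off (exact is ~0.022),
                                              # just as the CV check warns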

3) If there's very little variation in the $p_i$'s, you could use a binomial approximation.
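A sketch of that is a Binomial($N$, $\bar p$) tail (note that for the `p` above the $p_i$ actually vary two-fold, so this is purely illustrative):

    pbinom(k - 1, length(p), mean(p),
           lower.tail = FALSE)                # approx. P(X >= k): about 0.023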

4) You can simulate the distribution of the number of successes directly.
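A minimal simulation sketch, again with `p` and `k` as above; the Monte Carlo standard error of the estimated p-value is roughly $\sqrt{\hat p(1-\hat p)/\text{nsim}}$:

    set.seed(1)                               # for reproducibility
    nsim <- 1e5
    sims <- replicate(nsim,
                      sum(rbinom(length(p), 1, p)))  # Poisson binomial draws
    mean(sims >= k)                           # simulated one-tailed p-value, about 0.022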

5) The probability function for the convolution of the Bernoulli($p_i$) variables is fairly easy to compute numerically. For small numbers of variables (up to a couple of dozen easily), it can be done by brute force with no difficulty. [For large numbers, you're probably better off using FFT, as it's much faster, though you'll likely hit the point where normal approximation is very accurate long before it's much of an issue.]
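Here's a sketch of the brute-force version, convolving in one Bernoulli at a time (the function name is made up):

    # Exact pmf of the number of successes by iterated convolution.
    pb_pmf <- function(p) {
      pmf <- 1                                # pmf of the sum of zero trials
      for (pr in p)                           # convolve in Bernoulli(pr)
        pmf <- c(pmf * (1 - pr), 0) + c(0, pmf * pr)
      pmf                                     # pmf[j] = P(X = j - 1)
    }
    sum(pb_pmf(p)[(k + 1):(length(p) + 1)])   # exact P(X >= k): about 0.022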

In each of the cases (1) to (3), you can check the quality of the approximation via simulation... but if you do that, you might as well get the p-value the same way.

Incidentally, the R package poibin offers four calculations or approximations (it does not include the Poisson). There's an associated paper:

Hong, Y. (2012), "On computing the distribution function for the Poisson binomial distribution," Computational Statistics & Data Analysis. http://dx.doi.org/10.1016/j.csda.2012.10.006

where the author derives an expression based on the discrete Fourier transform.
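If I remember the package's interface correctly, usage looks roughly like the following, where `ppoibin(kk, pp, method)` returns $P(X \le kk)$ and method is one of "DFT-CF" (the default), "RF", "RNA" and "NA"; treat the argument names here as assumptions to check against the package documentation:

    # install.packages("poibin")             # package named in the text above
    library(poibin)
    p <- c(rep(0.05, 5), rep(0.10, 4))
    1 - ppoibin(kk = 2, pp = p)               # P(X >= 3) via the default DFT-CF method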


Here's an example, using the Poisson approximation:

$p_1, \ldots, p_5 = 0.05$; $p_6, \ldots, p_9 = 0.10$

The mean is $0.65$ and the standard deviation is $\sqrt{0.05\times 0.95\times 5 + 0.10 \times 0.90 \times 4} \approx 0.773$

The coefficient of variation rules out the normal approximation: its reciprocal is $0.65/0.773 \approx 0.84$, far below the rough thresholds suggested above. The variance divided by the mean is $0.5975/0.65 \approx 0.92$, which suggests the Poisson may not do too badly.

The attainable one-tailed significance levels in the typical range for a Poisson(0.65) are 13.8%, 2.8%, 0.44%, ... (corresponding to rejecting at 2, 3 or 4 or more successes). Let's say we choose 2.8%, which is to say if we see 3 or more successes we'll reject the null.

Now, the exact calculation is simply the convolution of a Binomial(5, 0.05) and a Binomial(4, 0.10), which gives:

             0         1         2         3         4         5        6       7 ... 
exact 0.5076777 0.3592339 0.1110464 0.0196723 0.0022006 0.0001612 7.70e-06 2.0e-07  
Pois  0.5220458 0.3393298 0.1102822 0.0238945 0.0038829 0.0005048 5.47e-05 5.1e-06 

As you see, they're at least close-ish out to about $X=3$, diverging in the far tail; the exact significance level for the rejection rule 'reject if there are at least 3 successes' is about 2.2%, while the Poisson gave about 2.8%. For my purposes that would be reasonable, but your own needs may differ.
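For what it's worth, the numbers above can be reproduced in a few lines of R:

    px <- dbinom(0:5, 5, 0.05)                # pmf of Binomial(5, 0.05)
    py <- dbinom(0:4, 4, 0.10)                # pmf of Binomial(4, 0.10)
    exact <- convolve(px, rev(py),
                      type = "open")          # their convolution, on 0..9
    round(exact, 7)                           # the 'exact' row of the table
    round(dpois(0:9, 0.65), 7)                # the 'Pois' row
    sum(exact[4:10])                          # exact P(X >= 3): about 0.022
    ppois(2, 0.65, lower.tail = FALSE)        # Poisson P(X >= 3): about 0.028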