Solved – Poisson distribution and statistical significance

Tags: distributions, poisson distribution, r, statistical significance

Let's say I have a website which gets 100 hits per day on average (mu = 100). Yesterday my website got 130 hits (x = 130). If I assume a Poisson distribution, then the probability of getting exactly 130 hits is:

> dpois(130, 100)
[1] 0.0005752527 # about 0.06%

So this tells me that getting 130 hits is quite unusual for my website due to the low probability.

My understanding of statistical significance is that it is used to determine whether the outcome of an experiment is due either to chance or some kind of deterministic relationship.

  1. How would I apply that in this situation?
  2. What test should one use? (and is it in R?)

Many thanks in advance for your time.

Note: I saw someone at a business talk ask something very similar to this and I had no idea what they meant by it, so now I'm just trying to educate myself. I'm new to R, but it seems to be the software most used for these kinds of questions, hence my request.

Best Answer

There are two points to make:

  1. It is not the specific value of 130 that is unusual, but that it is much larger than 100. If you had got more than 130 hits, that would have been even more surprising. So we usually look at P(X >= 130), not just P(X = 130). By your logic even 100 hits would be unusual, because dpois(100, 100) = 0.04. A more appropriate calculation is ppois(129, 100, lower.tail = FALSE) = 0.00228. This is still small, but not as extreme as your value. And this does not even take into account that an unusually low number of hits might also surprise you. We often multiply the probability of exceeding the observed count by 2 to account for this.
  2. If you keep checking your hits every day, sooner or later even rare events will occur. For example, P(X >= 130) = 0.00228 is of the same order as 1/365 (about 0.0027), so such an event would be expected to occur roughly once a year.
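
Both points can be reproduced in R. The following is a minimal sketch, assuming the numbers from the question (mu = 100, x = 130) and, for the second point, that daily counts are independent; the variable names are only illustrative. It also uses `poisson.test` from the base `stats` package as one possible answer to "what test should one use": with `alternative = "greater"` its p-value is the same upper-tail probability computed by `ppois`.

```r
# Assumed inputs, taken from the question: mean daily rate and observed count.
mu <- 100
x  <- 130

# Point probability P(X = 130). This is small for almost any single count,
# which is why it is not a good measure of "unusual".
p_point <- dpois(x, mu)

# Upper-tail probability P(X >= 130) = 1 - P(X <= 129).
p_upper <- ppois(x - 1, mu, lower.tail = FALSE)

# Rough two-sided value: double the upper tail (capped at 1).
p_doubled <- min(1, 2 * p_upper)

# One-sided exact Poisson test of the rate r = mu over T = 1 day.
test <- poisson.test(x, T = 1, r = mu, alternative = "greater")

# Repeated checking, ASSUMING independent days with the same rate:
days            <- 365
expected_days   <- days * p_upper             # about 0.83 such days per year
p_at_least_once <- 1 - (1 - p_upper)^days     # about 0.57

print(c(point = p_point, upper = p_upper, doubled = p_doubled))
print(test$p.value)
print(c(expected_days = expected_days, at_least_once = p_at_least_once))
```

Under these assumptions a day with 130 or more hits is significant at the usual 5% level when tested once, yet there is a better-than-even chance of seeing at least one such day in a year of daily checks.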