(1) Yes.
(2) Yes. There are only $n+1$ possible outcomes for a binomial random variable, so it is possible to look at what happens for each possible outcome - in fact this is faster than simulating lots and lots of outcomes!
Let $X$ be the number of "successes" among the $n$ customers and let $\hat{p}=X/n$. The confidence interval is $\hat{p}\pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, so the halfwidth is $z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. Thus we want to compute $P(z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\leq 0.005)$. In R, we can do this as follows:
target.halfWidth <- 0.005
p <- 0.016  # true proportion
n.vec <- seq(from = 1000, to = 3000, by = 100)  # candidate sample sizes
# Vector to store results
prob.hw <- rep(NA, length(n.vec))
# Loop through the candidate sample sizes
for (i in seq_along(n.vec)) {
  n <- n.vec[i]
  # Look at all possible outcomes
  x <- 0:n
  p.est <- x/n
  # Compute the halfwidth for each outcome (90% interval, so z_{alpha/2} = qnorm(0.95))
  halfWidth <- qnorm(0.95) * sqrt(p.est * (1 - p.est)/n)
  # What is the probability that the halfwidth is at most 0.005?
  prob.hw[i] <- sum((halfWidth <= target.halfWidth) * dbinom(x, n, p))
}
# Plot results
plot(n.vec, prob.hw, type = "b")
abline(0.95, 0, col = 2)
# Get the minimal n required
n.vec[min(which(prob.hw >= 0.95))]
The answer is $n=2200$ in this case as well.
Finally, it is usually a good idea to verify that the asymptotic normal approximation interval actually gives the desired coverage. In R, we can compute the coverage probability (i.e. the actual confidence level) as:
p <- 0.016
n <- 2200
x <- 0:n
p.est <- x/n
halfWidth <- qnorm(0.95) * sqrt(p.est * (1 - p.est)/n)
# Coverage probability
sum((abs(p - p.est) <= halfWidth) * dbinom(x, n, p))
Different $p$ give different coverages. For $p$ around $0.015$, the actual confidence level of the nominal $90\%$ interval seems to be about $89\%$ in general, which I presume is fine for your purposes.
(3) When you sample from a finite population, the number of successes is not binomial but hypergeometric. If the population is large compared to your sample size, the binomial works just fine as an approximation. If you sample 1000 out of 5000, say, it does not. Have a look at confidence intervals for proportions based on the hypergeometric distribution!
Answers to additional questions:
Let $(p_L,p_U)$ be the confidence interval.
1) In that case you are no longer computing $P(p_U-p_L\leq0.01)$ but $$P\Big(p_U-p_L\leq0.01~\mbox{and}~p\in(p_L,p_U)\Big),$$ i.e. the probability that an interval that actually contains $p$ has length at most 0.01. This may be an interesting quantity, depending on what you're interested in...
2) Maybe, but probably not. If the population size is large compared to the sample size you don't need it, and if it's not then the binomial distribution is not appropriate to begin with!
3) Sprop seems to contain confidence intervals based on the hypergeometric distribution, so that should work just fine.
Best Answer
The first thing is that with a constant intensity Poisson-process with rate $\lambda$ per unit time, the time between events has an exponential distribution with mean $\mu=1/\lambda$.
Second, you actually ask two separate questions, so let's deal with them one at a time.
If you average ten such events, the mean is distributed as $\text{gamma}(10,\mu/10)$ (this is the gamma parameterized in the shape-scale form, not the shape-rate form).
If you want a $1-\alpha$ interval for the population mean, then one approach is to construct a pivotal quantity for the mean, which is to say a quantity $Q$ which is a function of the data and the parameter whose distribution doesn't depend on the parameter.
Note that $\hat{\mu}/\mu$ will also have a gamma distribution, it will be distributed as $\text{gamma}(10,1/10)$, so that is a pivotal quantity (and works just fine for this), but it may be easier to work with its inverse $\mu/\hat{\mu}$, which will be distributed as an inverse-gamma distribution, specifically $\text{inverse-gamma}(10,1/10)$.
Similarly, for a sample of size $n$, the distribution of $\mu/\hat{\mu}$ will be $\text{inverse-gamma}(n,1/n)$.
(The inverse chi-square is a particular case of this distribution, and it, too, can be used for this purpose, with a suitable rescaling. So the advice you were given was correct, if not immediately helpful to you.)
(I'll assume you want a symmetric (in probability) interval.)
Let $l$ be the $\alpha/2$ quantile of an $\text{inverse-gamma}(n,1/n)$ distribution, and let $u$ be the $1-\alpha/2$ quantile - that is, the values that cut off a lower and upper tail area of $\alpha/2$.
Then $P(l<Q<u) = 1-\alpha$, so a $1-\alpha$ interval for $Q$ is $(l,u)$. Hence we can work out the interval for $\mu$:
$$l<\mu/\hat{\mu}<u$$
$$l\,\hat{\mu}<\mu<u\,\hat{\mu}$$
where $\hat{\mu}=\bar x$, which is to say a $1-\alpha$ interval for $\mu$ is $(l\,\bar x,u\, \bar x)$
Imagine the sample size was 10, and that the mean time was 3.21 seconds. Further, imagine you wanted a 95% interval.
I'm going to work in R, but anything that will give you gamma, inverse gamma, chi-square or inverse-chi-square quantiles will work.
So here's the 0.025 and 0.975 quantiles of a gamma(10,1/10):
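In R this might look like (shape-scale form; the printed values are rounded):

```r
# 0.025 and 0.975 quantiles of a gamma with shape 10 and scale 1/10
qgamma(c(0.025, 0.975), shape = 10, scale = 1/10)
# approximately 0.4795 1.7085
```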
The corresponding cutoffs for an inverse gamma will be the inverses of these, flipped around:
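One way to get them in R (take reciprocals and reverse the order):

```r
# reciprocals of the gamma quantiles, flipped so the lower cutoff comes first
rev(1 / qgamma(c(0.025, 0.975), shape = 10, scale = 1/10))
# approximately 0.5853 2.0853
```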
So that's our $l$ and $u$ respectively.
Hence a 95% interval for $\mu$ in this example is $(3.21\times 0.5853, 3.21\times 2.0853)$, or $(1.879, 6.694)$
That interval is far bigger than $\pm 20\%$, of course.
If you want to do it in terms of an inverse-chi-square instead, note that a $\chi^2(\nu)$ is a $\text{gamma}(\nu/2, 2)$ in the shape-scale form used here (equivalently, shape $\nu/2$ and rate $1/2$); from there it's reasonably easy to convert the above calculations.
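To check the equivalence numerically: a $\chi^2(20)$ scaled by $1/20$ has the same distribution as a $\text{gamma}(10, 1/10)$, so the quantiles agree:

```r
p <- c(0.025, 0.975)
qgamma(p, shape = 10, scale = 1/10)  # gamma quantiles directly
qchisq(p, df = 20) / 20              # same quantiles via the chi-square
```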
This is basically the inverse of the above problem, where we specify information about the interval and solve for $n$.
Edit: the approach I took here before didn't quite solve the problem as stated, but a related one (I had solved for the true mean being within 20% of the estimate). This version should be right:
An interval that's like this: $0.8 \mu <\hat \mu < 1.2 \mu$
is equivalent to (after dividing by $\mu\,\hat\mu$ and inverting):
$0.833 \hat\mu < \mu < 1.25 \hat\mu$
which is in the right form to play with.
So our interval is $l_n \hat{\mu}<\mu<u_n \hat{\mu}$.
You want to choose $n$ so that $l_n\geq 0.8333$ and $u_n\leq 1.25$; then the interval $(l_n\hat{\mu},u_n\hat{\mu})$ sits inside $(0.8333\,\hat{\mu},1.25\,\hat{\mu})$.
Unfortunately, the distribution of a $\text{gamma}(n,1/n)$ (and, correspondingly, of an inverse gamma) is different for every $n$.
One could of course proceed by trial and error and this will work perfectly well, if sometimes being a little tedious.
Half a dozen guesses for the lower limit gets us to:
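In place of the guessing, a small search in R (a sketch; the binding requirement here is $l_n \geq 1/1.2 \approx 0.8333$):

```r
# find the smallest n whose lower limit l_n = 1/qgamma(0.975, n, scale = 1/n)
# is at least 1/1.2
n <- 50
while (1 / qgamma(0.975, shape = n, scale = 1/n) < 1/1.2) n <- n + 1
n  # 106
```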
While for the other end:
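The analogous search for the upper limit (a sketch; the requirement $u_n \leq 1.25$ turns out to need fewer observations than the lower end):

```r
# find the smallest n whose upper limit u_n = 1/qgamma(0.025, n, scale = 1/n)
# is at most 1.25
n <- 50
while (1 / qgamma(0.025, shape = n, scale = 1/n) > 1.25) n <- n + 1
n
```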
Which means we need 106 observations to be within the desired limit on both ends and get the desired coverage.
Note that for the lower end we needed a larger sample size to achieve the desired width. [If you are willing to use an asymmetric interval (asymmetric in probability, I mean), in order to get a symmetric-in-percentage-error interval -- i.e. to be substantially more likely to be wrong at one end of the interval than the other -- then you can achieve a smaller sample size while still getting $1-\alpha$ coverage.]
Computing an asymmetric interval
If we try different values of $n$, we can find how much tail area is cut off either side of the +/-20%, and then find the smallest $n$ that totals to no more than $\alpha$ in the tails. In our example, the first place to try is to 'split the difference' and try $n=96$:
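In R, the tail areas outside the $\pm 20\%$ band (equivalently, outside $(0.8, 1.2)$ for $\hat\mu/\mu \sim \text{gamma}(n, 1/n)$) at $n=96$ can be computed as follows (printed values rounded):

```r
n <- 96
hi <- pgamma(1.2, shape = n, rate = n, lower.tail = FALSE)  # estimate at least 20% too high
lo <- pgamma(0.8, shape = n, rate = n)                      # estimate at least 20% too low
c(hi, lo, hi + lo)
# approximately 0.03033 0.01909 0.04942
```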
Those two tail areas give just under 0.05. Trying n=95 gives more than 0.05 in the tails combined, so n=96 is as small as you can go.
(Note that the 0.03033 value corresponds to the lower tail for the inverse gamma, but it's the probability of the estimate being at least 20% too high)
So if you are happy with more probability of being too high than too low, you can do n=96 for a 95% interval. [A higher coverage ($1-\alpha$) would give a larger $n$, and a lower coverage would give a smaller $n$.]
Simulation example in R to show that this approach is doing what it should. In this example, we calculate the limits ($l$ and $u$) for a symmetric-in-probability (2.5% in each tail) interval based on the 106 just computed:
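A sketch of that calculation (invert the 0.975 and 0.025 gamma quantiles at $n=106$; printed values rounded):

```r
n <- 106
l <- 1 / qgamma(0.975, shape = n, scale = 1/n)  # lower limit multiplier
u <- 1 / qgamma(0.025, shape = n, scale = 1/n)  # upper limit multiplier
c(l, u)
# approximately 0.8339 1.2214
```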
Since $l$ and $u$ lie within $(0.8333, 1.25)$, whenever the generated interval contains the true mean the estimate cannot be further than 20% from the true mean; and since, by construction, the generated interval contains the true mean 95% of the time, at least 95% of the time the estimate must be within 20% of the true mean.
Now we simulate ten thousand intervals to show it does have the required coverage:
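A sketch of such a simulation (I use a true mean of 3.21 as in the earlier example; the seed is arbitrary):

```r
set.seed(42)                          # arbitrary, for reproducibility
n <- 106
l <- 1 / qgamma(0.975, shape = n, scale = 1/n)
u <- 1 / qgamma(0.025, shape = n, scale = 1/n)
mu <- 3.21                            # true mean
covered <- replicate(10000, {
  xbar <- mean(rexp(n, rate = 1/mu))  # sample mean of n exponential times
  l * xbar < mu && mu < u * xbar      # does the interval contain mu?
})
mean(covered)
# close to 0.95
```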
- looks like it's right.
To avoid (or at least greatly reduce) the trial and error (not that it takes more than a few moments), we can make use of an approximation that will get us very close to the right answer - the Wilson-Hilferty transformation:
If $X \sim \chi^2(k)$ then $\scriptstyle\sqrt[3]{X/k}$ is approximately normally distributed with mean $\scriptstyle 1-2/(9k)$ and variance $\scriptstyle 2/(9k)$.
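For instance, since a $\text{gamma}(n,1/n)$ is a $\chi^2(2n)/(2n)$, Wilson-Hilferty gives the needed gamma quantile almost directly. A sketch comparing the approximation to the exact value at $n=106$ (so $k=2n=212$):

```r
n <- 106; k <- 2 * n
z <- qnorm(0.975)
# Wilson-Hilferty approximation to the 0.975 quantile of chisq(k)/k = gamma(n, 1/n)
approx.q <- (1 - 2/(9*k) + z * sqrt(2/(9*k)))^3
exact.q  <- qgamma(0.975, shape = n, scale = 1/n)
c(approx.q, exact.q)
```
The approximation is easily close enough here to replace most of the trial and error.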