Solved – Annual Probability of Occurrence

probability

I am trying to determine, based on the historical record, the annual probability of occurrence for a specific hazard. Take tornadoes. Over a 59-year time span I have documented 45 tornadoes. Some years in that span saw no events, which leaves 25 years out of 59 with at least one tornado. How would I determine the annual probability of an occurrence (e.g., a 10% chance of a tornado in a given year)?
I am not a mathematician, so please explain in a way that I can apply the formula to other historical events.

I appreciate any help you can give.

Best Answer

These kinds of calculations require assumptions, in effect a model.

Probably the most important for this example is whether you assume

(i) independence

(ii) constant risk

In practice, neither is likely to be true.

For example, if you have 25 years out of 59 with at least one tornado, you can try to apply a binomial model, but is it really the case that the probability of a tornado is the same in every year?

Or you could take the 45 tornadoes in 59 years and try to apply a Poisson model, but is it really the case that tornado occurrences are independent (and occur at a constant rate)? Or do they tend to cluster?

IF you could make the constant rate/probability and independence assumptions these calculations are straightforward:

Binomial model: Let $p$ be the underlying constant probability of at least one tornado in a year. We estimate it by the sample proportion, $\hat{p}=\frac{25}{59}= 0.424$.

You can also form a confidence interval for the proportion.
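As a minimal sketch of the binomial calculation, the snippet below computes $\hat{p}$ from the counts in the question and attaches an approximate 95% interval. The Wilson score interval is my choice for illustration (the answer does not say which interval to use), and the function and variable names are hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a binomial proportion.

    Assumption: z = 1.96 gives roughly 95% coverage. The choice of the
    Wilson interval is illustrative, not prescribed by the answer.
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

years_with_tornado = 25   # years with at least one tornado
years_observed = 59       # length of the record in years

p_hat = years_with_tornado / years_observed
low, high = wilson_interval(years_with_tornado, years_observed)

print(f"estimated annual probability: {p_hat:.3f}")
print(f"approximate 95% interval: ({low:.3f}, {high:.3f})")
```

To apply this to another hazard, replace the two counts with the number of years that had at least one event and the total number of years in the record.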

Poisson model: Here we have tornadoes occurring "at random", as in a Poisson process. The estimated rate of occurrence of tornadoes is simply the average annual rate, $\hat\lambda= \frac{45}{59}$ per year (that is, on average we saw 0.763 tornadoes per year, and that's our best estimate of the rate under the model). For this model, the estimated $P(\text{at least one tornado in a year}) = 1 - P(0 \text{ tornadoes}) = 1-\exp(-\hat\lambda) = 0.534$.

Again, it's possible to compute an interval here (e.g. by computing an interval for $\lambda$ and then transforming it); it's a bit more complicated.
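The Poisson calculation can be sketched the same way. The interval here is a rough normal approximation for $\lambda$ (treating the variance of the total count as equal to the count), which is then pushed through $1-\exp(-\lambda)$. That approximation is my assumption for illustration; an exact interval would need chi-square quantiles. The variable names are hypothetical.

```python
import math

tornado_count = 45    # total tornadoes in the record
years_observed = 59   # length of the record in years

# Estimated rate: average number of tornadoes per year.
rate_hat = tornado_count / years_observed

# Under a Poisson model, P(at least one) = 1 - P(zero) = 1 - exp(-rate).
p_at_least_one = 1 - math.exp(-rate_hat)

# Rough 95% interval for the rate (assumption: normal approximation,
# with the variance of the total count taken to be the count itself).
z = 1.96
half_width = z * math.sqrt(tornado_count) / years_observed
rate_low = max(0.0, rate_hat - half_width)
rate_high = rate_hat + half_width

# 1 - exp(-rate) is increasing in rate, so the endpoints map directly.
p_low = 1 - math.exp(-rate_low)
p_high = 1 - math.exp(-rate_high)

print(f"estimated rate: {rate_hat:.3f} per year")
print(f"P(at least one tornado in a year): {p_at_least_one:.3f}")
print(f"approximate 95% interval: ({p_low:.3f}, {p_high:.3f})")
```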

The problem is that both methods make assumptions that are unlikely to be tenable. Once we drop those assumptions we again need to assume some kind of model, and it's not up to me to tell you what your model should be; that depends on domain knowledge I am unlikely to possess. What you're likely to see in a more sophisticated model is that P(0) increases (while the mean number of tornadoes stays the same), implying that P(at least 1 tornado) is probably lower than in either of the models, but P(more than 1 | at least 1) will tend to be higher (i.e. tornadoes are 'clumpy').
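To see how clumpiness can raise P(0) without changing the mean, here is a small illustration that compares the Poisson model with a negative binomial distribution of the same mean. The negative binomial is purely my assumption as one common overdispersed count model; it is not a recommendation for tornado data, and the `size` values are arbitrary.

```python
import math

mean_per_year = 45 / 59  # same mean as the Poisson estimate

def p_zero_poisson(mean):
    """P(0 events) under a Poisson model."""
    return math.exp(-mean)

def p_zero_negbin(mean, size):
    """P(0 events) under a negative binomial with the given mean.

    Smaller `size` means more clumping; as `size` grows the
    distribution approaches the Poisson.
    """
    return (size / (size + mean)) ** size

print(f"Poisson:           P(0) = {p_zero_poisson(mean_per_year):.3f}")
for size in (0.5, 1, 5, 50):  # illustrative values only
    p0 = p_zero_negbin(mean_per_year, size)
    print(f"NegBin size={size:<4}: P(0) = {p0:.3f}, P(at least 1) = {1 - p0:.3f}")
```

With the mean held fixed, every finite `size` gives a larger P(0) than the Poisson model, so the probability of at least one event in a year is smaller.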
