Combining and modifying many probabilities of independent, non-mutually-exclusive events (for a simulation model)

algorithms, combinations, mathematical modeling, probability

Context

Essentially my question is how to modify a calculated probability for non-mutually-exclusive events without having to calculate the probability from scratch every time one of the variables in the equation changes. Let me explain:

I am developing a landscape forest disease model (agent-based model in NetLogo and Go programming languages) that calculates probabilities of infection for each tree as a function of state variables as well as distance from infected trees. Each tree can potentially be infected by thousands of other trees, and the landscape model includes millions of trees, so I am designing the model to calculate probabilities as quickly and efficiently as possible.

I need to calculate the probability that each tree (e.g., tree $i$) is infected by any tree $j$ within the infection transmission distance, i.e., that tree $i$ is infected by tree $j_1$ or $j_2$ or $j_3$. The infection events are independent (the probability of becoming infected by one tree does not influence the probability of being infected by another) and are not mutually exclusive (i.e., tree $i$ can be infected by tree $j_1$, or tree $j_2$, or both $j_2$ and $j_3$, etc.).

Because of the complexity of combining probabilities with thousands of possible infection source trees and all of the possible combinations that must be accounted for, e.g.,

$$p_\text{inf,i} = P( j_{1} \cup j_{2} \cup j_{3} \cup \dots ) = P(j_{1}) + P(j_{2}) + P(j_{3}) - P(j_{1} \cap j_{2}) - P(j_{1} \cap j_{3}) - P(j_{2} \cap j_{3}) + P(j_{1} \cap j_{2} \cap j_{3}) - \dots,$$

one person suggested I calculate the combined probability of infection as:

$$p_\text{inf,i} = 1-\prod_{j=1}^{J}{ (1 - p_\text{inf, i,j}) } $$
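As a quick check that the product form agrees with the expansion, consider only two source trees. Here $p_1$ and $p_2$ are shorthand (not notation from the model) for $p_\text{inf, i,j_1}$ and $p_\text{inf, i,j_2}$, and independence is assumed, so that $P(j_1 \cap j_2) = p_1 p_2$:

$$1-(1-p_1)(1-p_2) = p_1 + p_2 - p_1 p_2 = P(j_1) + P(j_2) - P(j_1 \cap j_2)$$

The product $\prod_j (1 - p_\text{inf, i,j})$ is the probability that no source tree infects tree $i$, so one minus it is the probability that at least one does.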

The probability of infection by tree $j$ is itself calculated as the product of the state-related probability $p_\text{s}$ and the distance-dependent probability $p_\text{d, i,j}$, or $p_\text{inf, i,j} = p_\text{s}*p_\text{d, i,j}$. So the probability of infection equation displayed above can be rewritten as:

$$p_\text{inf,i} = 1-\prod_{j=1}^{J}{ (1 - p_\text{s}*p_\text{d, i,j}) } $$

To be clear, the value of $p_\text{s}$ at a given time is the same for a probability of infection calculation between tree $i$ and any tree $j$, whereas the value of $p_\text{d, i,j}$ will differ between each tree pair depending on the distance between them.

Question

If the state of tree $i$ changes from one time step to another (e.g., the tree dies), the value of $p_\text{s}$ changes in every pairwise probability of infection calculation, and $p_\text{inf, i}$ has to be recalculated by substituting the new value of $p_\text{s}$ for the old one. However, because the full set of calculations takes so long given the large number of trees, I would like to just modify the already-calculated $p_\text{inf, i}$ to account for the change (e.g., using some factor that relates the old and new $p_\text{s}$, like the ratio between them ${p_\text{s,t} \over p_\text{s,t-1}}$), but I cannot figure out how to do that or whether it is mathematically possible. I want to avoid repeating the calculation for trees that have already been accounted for, because there are potentially many millions of calculations happening during a model run.

How can I modify the calculated value of $p_\text{inf, i}$, the probability of many independent, not mutually exclusive events, when the value of $p_\text{s}$ changes without recalculating the pairwise $p_\text{inf, i,j}$ for every tree pair? Is this possible?

I hope this is clear and coherent enough, and please forgive any errors in my equation writing. I am clearly not a mathematician.

Best Answer

I found an approximation that is accurate enough for my application based on a publication by Ursini and Martins (2017). It works best for large $n$ and low probabilities (i.e., the closer to 0, the better).

For this approximation, one first calculates the mean of the individual event probabilities $p_i$ in the union, where $n$ is the number of events:

$$\bar{p} = \frac{1}{n} \sum_{i=1}^{n}{p_i}$$

Then, one calculates the approximation of the union as:

$$P = 1-(1-\bar{p})^n$$
