I am trying to write a Monte Carlo simulation in R and I am really stuck! I want to know the probability distribution of a random person in the UK becoming ill from eating a cooked 100g piece of chicken.
I have the following information: out of 1000 pieces of chicken tested 20 had bacteria in question and I have data for the $\log_{10}$ counts of these 20 pieces, I also have min and max $\log_{10}$ counts (0.1 and 3.0). I also know the average person in UK eats 2 x 100g portions of chicken a week. The model for risk of illness given an ingested number of the bacteria is predicted by $R=1-\exp(-aD)$ where $D$ is the ingested number of organisms and I have a value for $a$.
I can write basic Monte Carlo simulations but I am struggling with the start of this one as I can't get my head around the model being ingested bacteria and the question being risk from eating a 100g portion.
- Is my first step here to obtain the CDF?
- And what is the distribution I should use?
Best Answer
Here is what I would do, in a two-steps answer to make things clearer. I suppose you want to compute the annual risk of getting sick (at least once). I propose a simple bootstraping procedure.
First, without resampling
Using your formula $r = 1- e^{-a d}$, you can compute the risk of disease $r_i$ for each of the 1000 pieces of chicken tested. You can estimate the risk $p$ of disease when eating one piece of chicken as the mean of the $r_i$’s. Here is a piece of code for that:
The result is $p = 6.9 \cdot 10^{-7}$. If you estimate that the average person eats $104$ pieces of chicken a year, her/his probability of disease in a year is $1-(1-p)^{104} \simeq 104 p = 7.17 \cdot 10^{-5}$.
Now, let’s resample
First, the risk estimation is dependent of your sample of 1000 pieces of chicken. Let’s resample it:
Then, model the number of chicken pieces the guy eats in a year by a Poisson $\mathcal P(104)$.
The probability of getting sick in a year is then
Just put all that in a loop of length 100000 and record the values, you’ll get a distribution of $p_1$ and $p_2$. You can plot a histogram:
You could also fit a Beta distribution on these...