Solved – Writing a Monte Carlo simulation in R

distributionsmonte carlor

I am trying to write a Monte Carlo simulation in R and I am really stuck! I want to know the probability distribution of a random person in the UK becoming ill from eating a cooked 100g piece of chicken.

I have the following information: out of 1000 pieces of chicken tested 20 had bacteria in question and I have data for the $\log_{10}$ counts of these 20 pieces, I also have min and max $\log_{10}$ counts (0.1 and 3.0). I also know the average person in UK eats 2 x 100g portions of chicken a week. The model for risk of illness given an ingested number of the bacteria is predicted by $R=1-\exp(-aD)$ where $D$ is the ingested number of organisms and I have a value for $a$.

I can write basic Monte Carlo simulations but I am struggling with the start of this one as I can't get my head around the model being ingested bacteria and the question being risk from eating a 100g portion.

  • Is my first step here to obtain the CDF?
  • And what is the distribution I should use?

Best Answer

Here is what I would do, in a two-steps answer to make things clearer. I suppose you want to compute the annual risk of getting sick (at least once). I propose a simple bootstraping procedure.

First, without resampling

Using your formula $r = 1- e^{-a d}$, you can compute the risk of disease $r_i$ for each of the 1000 pieces of chicken tested. You can estimate the risk $p$ of disease when eating one piece of chicken as the mean of the $r_i$’s. Here is a piece of code for that:

d<-c( rep(0,1980), c(1.158469, 2.01743,  1.896469, 1.055511, 1.263673, 1.616196, 
 1.197719, 0.913197, 1.108193, 2.058633, 0.904878, 1.241663, 1.525408, 1.730925, 
 1.143274, 1.200265, 1.103152, 1.465076, 1.838127, 1.162226) )

a <- 0.00005

R <- 1-exp(-a*d)
p <- mean(R);

The result is $p = 6.9 \cdot 10^{-7}$. If you estimate that the average person eats $104$ pieces of chicken a year, her/his probability of disease in a year is $1-(1-p)^{104} \simeq 104 p = 7.17 \cdot 10^{-5}$.

Now, let’s resample

First, the risk estimation is dependent of your sample of 1000 pieces of chicken. Let’s resample it:

d1 <- sample(d,1000,replace=TRUE)
R1 <- 1-exp(-a*d1)
p1 <- mean(R1);

Then, model the number of chicken pieces the guy eats in a year by a Poisson $\mathcal P(104)$.

N <- rpois(1,104)

The probability of getting sick in a year is then

p2 <- 1-(1-p1)**N

Just put all that in a loop of length 100000 and record the values, you’ll get a distribution of $p_1$ and $p_2$. You can plot a histogram:

histograms

You could also fit a Beta distribution on these...