Finding precision of Monte Carlo simulation estimate

confidence interval, monte carlo, simulation, standard error

Background

I am designing a Monte Carlo simulation that combines the outputs of a series of models, and I want to be sure that the simulation will allow me to make reasonable claims about the probability of the simulated outcome and the precision of that probability estimate.

The simulation will find the probability that a jury drawn from a specified community will convict a certain defendant. These are the steps of the simulation:

  1. Using existing data, generate a logistic probability model (M) by regressing “juror first ballot vote” on demographic predictors.

  2. Use Monte Carlo methods to simulate 1,000 versions of M (i.e., 1,000 versions of the coefficients for the model parameters).

  3. Select one of the 1,000 versions of the model (Mi).

  4. Empanel 1,000 juries by randomly selecting 1,000 sets of 12 “jurors” from a “community” (C) of individuals with specified demographic characteristic distributions.

  5. Deterministically calculate the probability of a first ballot guilty vote for each juror using Mi.

  6. Render each "juror’s" probable vote into a determinate vote (based on whether it is greater or less than a randomly selected value between 0 and 1).

  7. Determine each "jury’s" “final vote” by using a model (derived from empirical data) of the probability a jury will convict, conditional on the proportion of jurors voting for conviction on the first ballot.

  8. Store the proportion of guilty verdicts for the 1,000 juries (PGi).

  9. Repeat steps 3-8 for each of the 1,000 simulated versions of M.

  10. Calculate the mean value of PG and report that as the point estimate of the probability of conviction in C.

  11. Identify the 2.5th and 97.5th percentile values of PG and report them as the 0.95 confidence interval.
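
The steps above can be sketched in Python roughly as follows. This is only an illustration, not the actual models: the community C is reduced to a single standardized demographic score, the coefficient distributions in `run_simulation` are made-up numbers, and the verdict model in step 7 is a placeholder logistic curve. The function names `simulate_pg` and `run_simulation` are hypothetical, and the default counts are smaller than 1,000 only so that the pure-Python loop runs quickly.

```python
import math
import random
import statistics


def logistic(x):
    """Standard logistic function, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_pg(coefs, n_juries, rng, jury_size=12):
    """Steps 4-8: proportion of guilty verdicts (PGi) for one model version Mi."""
    intercept, slope = coefs
    guilty = 0
    for _ in range(n_juries):
        # Step 4: draw 12 "jurors" from a hypothetical community C, described
        # here by one standardized demographic score (an assumption).
        jurors = [rng.gauss(0.0, 1.0) for _ in range(jury_size)]
        # Steps 5-6: compute each juror's vote probability under Mi and turn
        # it into a determinate vote by comparing with a uniform draw.
        votes = sum(rng.random() < logistic(intercept + slope * x) for x in jurors)
        # Step 7: placeholder verdict model (assumption): probability of
        # conviction as a function of the first-ballot guilty proportion.
        p_convict = logistic(8.0 * (votes / jury_size - 0.5))
        guilty += rng.random() < p_convict
    # Step 8: proportion of guilty verdicts for this Mi.
    return guilty / n_juries


def run_simulation(n_models=200, n_juries=200, seed=0):
    """Steps 2-11: point estimate and percentile interval for PG."""
    rng = random.Random(seed)
    pgs = []
    for _ in range(n_models):
        # Steps 2-3: one simulated version of M; the means and standard
        # deviations below are illustrative, not fitted values.
        coefs = (rng.gauss(0.2, 0.15), rng.gauss(0.8, 0.2))
        pgs.append(simulate_pg(coefs, n_juries, rng))
    # Step 10: mean of PG as the point estimate.
    point = statistics.fmean(pgs)
    # Step 11: with n=40 the first and last cut points are the 2.5th and
    # 97.5th percentiles.
    cuts = statistics.quantiles(pgs, n=40, method="inclusive")
    return point, (cuts[0], cuts[-1])
```

Calling `run_simulation()` returns the point estimate and the pair of percentile bounds. Note that the spread of the PGi values in this design mixes two sources of variation: the uncertainty in the coefficients of M and the sampling noise from using a finite number of juries per PGi.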

I am currently using 1,000 jurors and 1,000 juries on the theory that 1,000 random draws from a probability distribution—demographic characteristics of C or versions of M—will fill out that distribution.

Questions

Will this allow me to accurately determine the precision of my estimate? If so, how many juries do I need to empanel for each PGi calculation to cover C's probability distribution (so I avoid selection bias); may I use fewer than 1,000?

Thank you so much for any help!

Best Answer

There is one general, "in-universe" criterion for the quality of a Monte Carlo simulation: convergence.

Stick to one M and check how PG behaves as the number of juries increases. It should converge, which will show you the number of repetitions at which you get a reasonable (for your application) number of significant digits. Repeat this benchmark for a few other Ms to be sure you weren't just lucky with your choice of M, then proceed to the whole simulation.
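
A minimal sketch of such a convergence benchmark is given below. It assumes a single fixed M that gives every juror the same first-ballot guilty probability (`p_guilty_vote`, a made-up value) and a placeholder verdict model; `verdict` and `convergence_table` are hypothetical names. Alongside the running PG it reports the usual binomial standard error, sqrt(PG(1 - PG)/n), as a rough guide to how many digits are stable.

```python
import math
import random


def verdict(rng, p_guilty_vote=0.55, jury_size=12):
    """One simulated jury under a fixed model M (hypothetical stand-in)."""
    votes = sum(rng.random() < p_guilty_vote for _ in range(jury_size))
    # Placeholder for the empirical verdict model (assumption).
    p_convict = 1.0 / (1.0 + math.exp(-8.0 * (votes / jury_size - 0.5)))
    return rng.random() < p_convict


def convergence_table(checkpoints=(100, 300, 1000, 3000, 10000), seed=0):
    """Running PG and its approximate standard error at increasing jury counts.

    `checkpoints` must be given in increasing order.
    """
    rng = random.Random(seed)
    rows = []
    guilty = 0
    n = 0
    for target in checkpoints:
        # Keep adding juries to the same running total until the checkpoint.
        while n < target:
            guilty += verdict(rng)
            n += 1
        pg = guilty / n
        se = math.sqrt(pg * (1.0 - pg) / n)  # binomial standard error
        rows.append((n, pg, se))
    return rows


if __name__ == "__main__":
    for n, pg, se in convergence_table():
        print(f"{n:>6} juries: PG = {pg:.4f}, approx. SE = {se:.4f}")
```

Because the standard error shrinks like 1/sqrt(n), each additional stable digit costs roughly 100 times more juries, so the table should make clear whether fewer than 1,000 juries is enough for the precision you need.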
