Solved – How to find the closest distribution for a given data set

distributions, goodness of fit, stochastic-processes

I have inter-arrival times of vehicles recorded by a vehicle detection algorithm. I want to find the distribution (e.g., Poisson or another) that most closely fits this data.

How can I do that?

Here is a graph of the inter-arrival times from a crosspost on SO.

[Image: graph of the inter-arrival times]

Best Answer

I'd suggest starting with a quick read of the chapter of Law and Kelton's "Simulation Modeling and Analysis" textbook that discusses methods for selecting distributions to use in Monte Carlo simulations. This chapter discusses methods for selecting candidate distributions, fitting the distributions to your data, and then testing the goodness of fit.
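As a rough illustration of that select, fit, and test loop, here is a minimal sketch using only the Python standard library. Everything specific in it is my own assumption rather than something from the book: the data are synthetic (drawn from an exponential), the two candidates (exponential and lognormal) are arbitrary, and the Kolmogorov-Smirnov distance is just one of several possible goodness-of-fit measures. Because the parameters are estimated from the same data, the usual KS p-value tables don't apply, so the sketch only compares distances.

```python
import math
import random

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a model `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def fit_exponential(sample):
    """Maximum-likelihood exponential fit; returns the fitted CDF."""
    rate = len(sample) / sum(sample)
    def cdf(x):
        return 1.0 - math.exp(-rate * x) if x > 0 else 0.0
    return cdf

def fit_lognormal(sample):
    """Maximum-likelihood lognormal fit; returns the fitted CDF."""
    logs = [math.log(x) for x in sample]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    def cdf(x):
        if x <= 0:
            return 0.0
        z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))
    return cdf

# Synthetic stand-in for the recorded inter-arrival times (assumption).
random.seed(0)
gaps = [random.expovariate(0.5) for _ in range(500)]

candidates = {
    "exponential": fit_exponential(gaps),
    "lognormal": fit_lognormal(gaps),
}
ks_by_name = {name: ks_statistic(gaps, cdf) for name, cdf in candidates.items()}

# Smaller distance means the fitted CDF tracks the data more closely.
for name, d in sorted(ks_by_name.items(), key=lambda kv: kv[1]):
    print(f"{name}: KS distance = {d:.3f}")
```

With real data you would replace `gaps` with your measured inter-arrival times and probably add more candidates (gamma, Weibull, and so on); a Q-Q plot or histogram overlay is a useful companion to any single statistic.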

It's quite common to find that many different distributions adequately fit your data. Depending on what you're doing with your model, the choice that you make can have a big effect on the results. In that case, it's appropriate to run your simulation with the different distributions to see how sensitive your results are to the assumed distribution.
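To make the sensitivity check concrete, here is a toy sketch. The model is entirely hypothetical: a single-server, first-come-first-served queue with a fixed service time of 0.8, fed by three inter-arrival distributions that all have mean 1.0 but different variability. The function name `mean_wait` and all parameter values are mine, chosen only for illustration.

```python
import random

def mean_wait(draw_gap, service_time=0.8, n_customers=50_000, seed=1):
    """Average queueing delay in a single-server FIFO queue (Lindley recursion)."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(n_customers):
        total += wait
        # Next customer's wait: leftover work minus the gap until they arrive.
        wait = max(0.0, wait + service_time - draw_gap(rng))
    return total / n_customers

# Three candidate inter-arrival models, each with mean 1.0 (hypothetical choices).
candidates = {
    "exponential": lambda rng: rng.expovariate(1.0),
    "gamma": lambda rng: rng.gammavariate(2.0, 0.5),        # shape 2, scale 0.5
    "lognormal": lambda rng: rng.lognormvariate(-0.5, 1.0),  # exp(mu + sigma^2/2) = 1
}

results = {name: mean_wait(draw) for name, draw in candidates.items()}
for name, w in results.items():
    print(f"{name}: mean wait = {w:.2f}")
```

If the output of interest (here, the mean wait) changes a lot between candidates that all fit the data acceptably, the choice of distribution matters for your conclusions and deserves more scrutiny.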

For interarrival times, it is nearly always the case in practice that the Poisson process (that is, exponentially distributed interarrival times, which gives a Poisson distribution for the number of arrivals in a fixed time period) is the way to go. However, the arrival rate may vary (e.g., by day of the week or time of day).
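The link between exponential gaps and Poisson counts can be checked numerically. The sketch below uses synthetic data with an assumed constant rate of 2 arrivals per minute (my numbers, not the asker's): it builds arrival times from exponential gaps, counts arrivals in one-minute windows, and compares the mean and variance of those counts, which should be roughly equal for a Poisson distribution.

```python
import random
from collections import Counter

random.seed(42)
rate = 2.0        # assumed arrivals per minute (synthetic)
horizon = 2000.0  # minutes of simulated observation

# Build arrival times by accumulating exponential inter-arrival gaps.
arrivals, t = [], 0.0
while True:
    t += random.expovariate(rate)
    if t >= horizon:
        break
    arrivals.append(t)

# Count arrivals in consecutive one-minute windows (empty windows count as 0).
per_window = Counter(int(a) for a in arrivals)
counts = [per_window.get(w, 0) for w in range(int(horizon))]

mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
dispersion = var_count / mean_count  # near 1 when counts are Poisson-like

print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}, "
      f"dispersion = {dispersion:.2f}")
```

On real traffic data, a dispersion index well above 1 is often a sign that the rate is not constant. In that case it may help to repeat the check separately within each period (say, per hour of day) and fit a different rate to each.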