I was wondering whether there are established rules of thumb (or algorithms) that, given a set of observations, can help:
- choose an initial number of class intervals.
- refine that choice to a better number.
I have seen sqrt(N) suggested as an initial guess for the number of class intervals, where N is the number of observations.
Thanks in advance.
Best Answer
The help page for the R command hist (http://stat.ethz.ch/R-manual/R-patched/library/grDevices/html/nclass.html) gives some references to algorithms for computing the number of bins:
Sturges, H. A. (1926) The choice of a class interval. Journal of the American Statistical Association 21, 65–66.
Scott, D. W. (1979) On optimal and data-based histograms. Biometrika 66, 605–610.
Freedman, D. and Diaconis, P. (1981) On the histogram as a density estimator: L_2 theory. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 57, 453–476.
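
To make the rules concrete, here is a minimal Python sketch of the bin-count formulas commonly associated with these three references, plus the square-root guess from the question for comparison. The function names are my own, and this is not a copy of R's implementation: details such as the quantile method and the handling of zero spread are assumptions, so results may differ slightly from R on some data.

```python
import math
import statistics


def sqrt_rule(data):
    """Square-root guess from the question: ceil(sqrt(N))."""
    return math.ceil(math.sqrt(len(data)))


def sturges(data):
    """Sturges-style rule: ceil(log2(N) + 1). Depends only on N."""
    return math.ceil(math.log2(len(data)) + 1)


def _bins_from_width(data, width):
    """Turn a bin width into a bin count covering the data range."""
    span = max(data) - min(data)
    # Assumption: fall back to a single bin when there is no spread.
    if width <= 0 or span == 0:
        return 1
    return max(1, math.ceil(span / width))


def scott(data):
    """Scott-style rule: bin width 3.5 * sd * N^(-1/3)."""
    n = len(data)
    width = 3.5 * statistics.stdev(data) * n ** (-1 / 3)
    return _bins_from_width(data, width)


def freedman_diaconis(data):
    """Freedman-Diaconis-style rule: bin width 2 * IQR * N^(-1/3)."""
    n = len(data)
    # Assumption: quartiles via the "inclusive" interpolation method.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    width = 2 * (q3 - q1) * n ** (-1 / 3)
    return _bins_from_width(data, width)
```

For the 100 values 1, 2, ..., 100 these return 10, 8, 5 and 5 bins respectively. The first two depend only on N, while the last two also use the spread of the data, which is one way to refine an initial guess.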