Solved – How to perform a non-equi-spaced histogram in R

histogram, r

From the R docs for hist:

R's default with equi-spaced breaks (also the default) is to plot the
counts in the cells defined by breaks. Thus the height of a rectangle
is proportional to the number of points falling into the cell, as is
the area provided the breaks are equally-spaced.

The default with non-equi-spaced breaks is to give a plot of area one,
in which the area of the rectangles is the fraction of the data points
falling in the cells.

So how do I get hist to plot non-equi-spaced breaks? It sounds as if it will calculate the breaks to end up with area one, but I don't see the options.

Edit: Also, what are recommended ways (in R) to do non-equi-spaced histograms? A typical case would be data that is spiky, causing all the action in one or a few cells, no matter how many are given as "breaks". Another would be two areas of activity separated by a large area of zero, meaning no matter how many breaks, all you see is flat, with two huge narrow spikes. Or perhaps worse, one area of activity, then another area of much less activity far away that causes the graph to be very wide and flat.

Best Answer

You will notice that hist() has an argument called breaks, which defaults to "Sturges". You can also supply your own breakpoints instead of using the default Sturges algorithm, as follows:

# breaks must cover the full range of `data`, otherwise hist() stops with an error
breakpoints <- c(0, 1, 10, 11, 12)
hist(data, breaks = breakpoints)

If you read all the way down to the bottom of the help page for hist (?hist), there are a couple of examples with non-equidistant breaks as well.
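On the "area one" part of the question: hist does not choose the breaks for you. You supply them, and when they are unequally spaced it switches from counts to densities, so each bar's area is the fraction of points in that cell. Here is a minimal sketch of that behaviour. The data and the break values are made up for illustration and are not from the question.

```r
# Illustrative data only (an assumption): a spike in [0, 1] plus a sparse tail.
set.seed(42)
x <- c(runif(200, 0, 1), runif(50, 1, 12))

# Narrow cells where the data is dense, wide cells where it is sparse.
# The breaks must span range(x).
breakpoints <- c(0, 0.25, 0.5, 1, 4, 12)

# Compute the histogram without drawing it, so the pieces can be inspected.
h <- hist(x, breaks = breakpoints, plot = FALSE)

widths <- diff(h$breaks)        # cell widths
areas  <- h$density * widths    # bar area = fraction of points in the cell
total  <- sum(areas)            # adds up to 1 (up to floating point)

# Drawing the object uses densities because the breaks are not equidistant.
plot(h, main = "Unequal breaks: bar area = fraction of data")
```

If you want counts on the y axis anyway, hist(x, breaks = breakpoints, freq = TRUE) will draw them, but R warns that the areas are then wrong, which is the "WRONG histogram" example in ?hist.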

Update: This may not be a direct answer to your question, but you could use a different kind of graph than a histogram. Personally, I don't find histograms terribly useful. Instead you could try a kernel density plot, which I think would address the first two cases you list (I don't see how you can get around the third). In R, the code would be: plot(density(data)).
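A slightly fuller sketch of the density approach, for the "two areas of activity separated by a large gap" case. The simulated data and the adjust value are assumptions chosen for illustration. With clusters this far apart the default bandwidth can smooth each one quite heavily, so the second curve uses a smaller bandwidth for comparison.

```r
# Illustrative data only (an assumption): two tight clusters far apart.
set.seed(1)
x <- c(rnorm(300, mean = 0, sd = 0.5), rnorm(300, mean = 50, sd = 0.5))

d_default <- density(x)                # Gaussian kernel, default bandwidth
d_narrow  <- density(x, adjust = 0.1)  # one tenth of the default bandwidth

# The narrower estimate has the taller peaks, so use it to set up the axes.
plot(d_narrow, main = "Kernel density estimates")
lines(d_default, lty = 2)              # default bandwidth, dashed
rug(x)                                 # tick marks for the raw observations
```

Which bandwidth is appropriate depends on the data, so treat adjust = 0.1 as a starting point to experiment with rather than a recommendation.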
