Solved – How to merge histograms with different bin counts and different ranges

density function, distributions, histogram

Assume that we cannot access the underlying data of the original histograms, but we do know the number of observations in each histogram. When the original histograms share the same range and the same number of bins, we can simply add the frequencies of the corresponding bins across all the original histograms. However, things are more complicated when the ranges and the numbers of bins differ.

Are there established methods for merging histograms with different ranges and different numbers of bins?

Is there any existing research interest in this topic?

Extra question: after merging the histograms, how can we quantify the uncertainty of the new histogram?

Best Answer

A histogram is a density estimator! Assuming you have expressed the two histograms this way, that is, with the y-axis in density units (density is probability per unit length along the x-axis), we can express the combined histogram as a mixture density of the two given histograms. Let $f_1(x), f_2(x)$ be the two given histograms, with sample sizes $n_1, n_2$ and $n = n_1 + n_2$. Then the combined histogram $f(x)$ is $$ f(x) = \frac{n_1}{n} f_1(x) + \frac{n_2}{n} f_2(x) $$ Others will have to chime in on the question about uncertainty.
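
As a rough illustration of the mixture formula, here is a minimal Python sketch using NumPy. It assumes each histogram is available as an array of bin edges plus an array of per-bin counts; the function name `merge_histograms` and the example numbers are made up for illustration. The merged density is piecewise constant on the sorted union of both sets of bin edges, and each input histogram contributes zero outside its own range.

```python
import numpy as np

def merge_histograms(edges1, counts1, edges2, counts2):
    """Merge two histograms as a sample-size-weighted mixture density.

    Hypothetical helper: returns (edges, density), where `edges` is the
    sorted union of both sets of bin edges and `density` holds the
    piecewise-constant mixture density on each resulting interval.
    """
    edges1 = np.asarray(edges1, dtype=float)
    edges2 = np.asarray(edges2, dtype=float)
    counts1 = np.asarray(counts1, dtype=float)
    counts2 = np.asarray(counts2, dtype=float)

    n1, n2 = counts1.sum(), counts2.sum()
    n = n1 + n2

    # Convert counts to density units: count / (sample size * bin width).
    dens1 = counts1 / (n1 * np.diff(edges1))
    dens2 = counts2 / (n2 * np.diff(edges2))

    # Every edge of either histogram becomes an edge of the merged one,
    # so each merged interval lies inside at most one bin per histogram.
    edges = np.union1d(edges1, edges2)
    mids = 0.5 * (edges[:-1] + edges[1:])

    def evaluate(x, e, d):
        # Value of a piecewise-constant density at points x (0 outside its range).
        idx = np.searchsorted(e, x, side="right") - 1
        inside = (x >= e[0]) & (x < e[-1])
        out = np.zeros_like(x)
        out[inside] = d[idx[inside]]
        return out

    # f(x) = (n1 / n) * f1(x) + (n2 / n) * f2(x)
    density = (n1 / n) * evaluate(mids, edges1, dens1) \
            + (n2 / n) * evaluate(mids, edges2, dens2)
    return edges, density


# Made-up example: two bins on [0, 2] and one bin on [1, 3], 4 observations each.
edges, density = merge_histograms([0, 1, 2], [2, 2], [1, 3], [4])
print(edges)    # merged bin edges
print(density)  # mixture density on each merged interval
```

Note that the merged histogram generally has unequal bin widths. If equal-width bins are wanted, the piecewise-constant density would have to be integrated over the new bins, which implicitly treats the observations as uniformly spread within each original bin.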