Solved – Comparing two gaussian distributions

distance, normal distribution

Apologies if this is a really simple question; I'm sure if only I knew what to google I'd be able to find the answer myself, but it's been driving me mad.

I have two datasets with approximately gaussian distributions. Both are measurements of the same background distribution, taken for reproducibility of some optical instrumentation I've developed. I need to prove this using the two measurements.

My understanding is that to achieve this, I integrate the common area that lies underneath both measured distributions. However…

In my case, gaussian 1 has a mean of 41.3 and a standard deviation of 1.0. Gaussian 2 has a mean of 41.7 and a standard deviation of 1.6. This means that the two gaussians intersect twice.

When I integrate the common area, I get 0.76, which I interpret to mean there's a 0.76 probability that the two measurements are of the same background distribution. This sounds way too low to me.
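For reference, the common-area figure can be reproduced in a few lines. This is only a sketch that assumes both measurements are exactly Gaussian with the fitted parameters quoted above; it uses `NormalDist.overlap` from Python's standard `statistics` module (Python 3.8+), and the names `g1`, `g2` and `overlap` are just illustrative.

```python
from statistics import NormalDist

# Fitted parameters quoted in the question (assumed to be exact Gaussians)
g1 = NormalDist(mu=41.3, sigma=1.0)
g2 = NormalDist(mu=41.7, sigma=1.6)

# Area lying underneath both probability density curves
# (the "overlapping coefficient"), a number between 0 and 1
overlap = g1.overlap(g2)
print(f"common area: {overlap:.3f}")
```

The result should come out at roughly three quarters, close to the 0.76 above. Note that it only describes how much the two fitted curves share; the sample sizes do not enter into it at all.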

I had a look at KL divergence, but this is asymmetric and assumes that one of the measured distributions is the 'true' distribution – this is not the case for my measurements.

I have some more similar comparisons with more than two measured distributions to worry about, but I'd like to walk before trying to run…

Best Answer

What you are looking for is a two-sample test for equality of distribution. There are a number of known tests of this kind, including the Wald-Wolfowitz two-sample runs test, the Friedman-Rafsky two-sample runs test, the Kolmogorov-Smirnov two-sample test, the Henze nearest neighbour test, and the Zech-Aslan minimum energy test. There are probably many others, but these will get you started. The Kolmogorov-Smirnov two-sample test is a particularly common choice: it is easy to implement and has an explicit formula for approximating the p-value.
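To give an idea of how little code the Kolmogorov-Smirnov two-sample test needs, here is a minimal sketch using only the Python standard library. The function name `ks_two_sample`, the `terms` parameter and the simulated data are all my own assumptions for illustration. The p-value uses the usual large-sample (asymptotic) series, so treat it as approximate for small samples. Note also that the test works on the raw measurements, not on the fitted mean and standard deviation.

```python
import math
import random
from bisect import bisect_right


def ks_two_sample(a, b, terms=100):
    """Return (D, approximate p-value) for the two-sample KS test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D is the largest vertical gap between the two empirical CDFs;
    # it is always attained at one of the observed data points.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
            for x in a + b)
    # Asymptotic p-value: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.1:
        # The series converges slowly here, and the p-value is essentially 1
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, terms + 1))
    return d, min(max(p, 0.0), 1.0)


# Synthetic stand-in data, NOT the real measurements: two samples drawn
# from Gaussians with the parameters quoted in the question.
random.seed(0)
sample1 = [random.gauss(41.3, 1.0) for _ in range(200)]
sample2 = [random.gauss(41.7, 1.6) for _ in range(200)]

d_stat, p_value = ks_two_sample(sample1, sample2)
print(f"D = {d_stat:.3f}, approximate p-value = {p_value:.4f}")
```

A small p-value is evidence that the two samples do not come from the same distribution; a large one means only that the test found no evidence of a difference, which is not the same as proving they are equal. If SciPy is available, `scipy.stats.ks_2samp` does the same job and is the safer choice for real work.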
