[Math] how far the distribution from the uniform distribution

distribution-theorydivergence-operatorprobability distributionsuniform distribution

I have two discrete probability distributions $P$ and $Q$, where $P=(p_1,…,p_n)$ and $Q=(q_1,…,q_n)$, in addition I have uniform distribution $U=(\frac{1}{n},…,\frac{1}{n})$.

The question is how to measure which distribution $P$ or $Q$ is the closest to uniform distribution.

I am not sure if I can use Kullback–Leibler divergence because is not a "correct" distance. Also, I don't know if I can use entropy.

Best Answer

Total variation distance, also known as statistical distance, is a good metric (very stringent). (Note that up to a factor $2$, it's equivalent to $\ell_1$ distance between the vectors of probabilities.) It also has a nice interpretation in terms of closeness of events.
$\ell_2$ will be much more forgiving towards small differences, and put the emphasis on outliers.
Hellinger also has some nice properties, and interpretation (although is maybe less commonly used).
Kolmogorov distance (equivalently., the $\ell_\infty$ distance between CDFs) will make sense if your domain $\{1,\dots,n\}$ has a meaningful order on it.

All of these (and more, e.g. Wasserstein/Earthmover) are valid choices -- ultimately, it'll depend on your application.

A good resource: Distances and affinities between measures, Chapter 3 of Asymptopia by Pollard. "On choosing and bounding probability metrics" by Gibbs and Su is also a recommended read.

Best Answer

Related Solutions

Equivalence Between Uniform and Normal Distribution

[Math] Similarity between two probability distribution

Related Question