[Math] What structure is needed to define a Gaussian distribution on a given space

pr.probability, probability-distributions

In most textbooks, the normal distribution is defined on $\mathbb{R}^n$ by specifying its probability density function. This works perfectly well, but it isn't really amenable to generalisation.

I'm wondering what the minimal structure is that one must have on a given space $S$ before one can define an analogue of the normal distribution. If $S$ has a Riemannian metric defined on it, then one can define a Brownian motion on $S$ using the Laplace-Beltrami operator. The family of normal distributions on $S$ could then be defined as the one-dimensional marginals of the Brownian motion.
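
As a concrete instance of this construction, here is a minimal sketch for the simplest compact example, the unit circle. It assumes Brownian motion with generator $\frac{1}{2}\Delta$ (so that on $\mathbb{R}$ the time-$t$ marginal has variance $t$), and the function names are made up for illustration. The marginal density with respect to arc length can be written either as a sum of Euclidean Gaussians over all lifts of the target point (the "wrapped normal") or as an eigenfunction expansion of the Laplace-Beltrami operator; the code evaluates both forms so they can be compared.

```python
import math

def circle_kernel_images(theta, phi, t, K=20):
    """Wrapped normal: sum the Euclidean Gaussian over the lifts phi + 2*pi*k.

    Assumes generator (1/2) * Laplacian, i.e. variance t on the real line.
    K is an illustrative truncation of the infinite sum.
    """
    norm = math.sqrt(2.0 * math.pi * t)
    return sum(
        math.exp(-((phi - theta + 2.0 * math.pi * k) ** 2) / (2.0 * t)) / norm
        for k in range(-K, K + 1)
    )

def circle_kernel_fourier(theta, phi, t, N=200):
    """Same density via the eigenfunctions cos(n x), sin(n x) of the Laplacian.

    The eigenvalue of mode n is n^2, so the mode is damped by exp(-n^2 t / 2).
    N is an illustrative truncation of the infinite sum.
    """
    s = 1.0 + 2.0 * sum(
        math.exp(-n * n * t / 2.0) * math.cos(n * (phi - theta))
        for n in range(1, N + 1)
    )
    return s / (2.0 * math.pi)

def total_mass(t, M=400):
    """Riemann sum of the density started at 0 over one full turn of the circle."""
    h = 2.0 * math.pi / M
    return h * sum(circle_kernel_images(0.0, j * h, t) for j in range(M))

if __name__ == "__main__":
    for t in (0.1, 1.0, 5.0):
        a = circle_kernel_images(0.0, 1.0, t)
        b = circle_kernel_fourier(0.0, 1.0, t)
        print(t, a, b, total_mass(t))
```

For small $t$ the density is close to a Euclidean Gaussian concentrated near the starting point; for large $t$ it flattens towards the uniform density $1/(2\pi)$, which has no counterpart on $\mathbb{R}$.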

Alternatively, the normal distribution could be characterised as the distribution that maximises entropy when its mean and variance are known. This characterisation only seems to require a notion of "mean" (which suggests that $S$ must be a metric space).
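
For reference, here is a sketch of the standard Lagrange-multiplier argument on $\mathbb{R}$ (the multipliers $\lambda_0,\lambda_1,\lambda_2$ are notation introduced here). One maximises $h(p)=-\int p\log p\,dx$ subject to $\int p\,dx=1$, $\int x\,p\,dx=\mu$ and $\int (x-\mu)^2p\,dx=\sigma^2$. Setting the variation of the Lagrangian to zero gives

$$-\log p(x)-1+\lambda_0+\lambda_1 x+\lambda_2 (x-\mu)^2=0 \quad\Longrightarrow\quad p(x)\propto \exp\!\big(\lambda_1 x+\lambda_2 (x-\mu)^2\big).$$

Integrability forces $\lambda_2<0$, the mean constraint forces $\lambda_1=0$, and the variance constraint gives $\lambda_2=-\frac{1}{2\sigma^2}$, so

$$p(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

Note that the constraints $\int x\,p\,dx$ and $\int (x-\mu)^2p\,dx$ use the linear structure of $\mathbb{R}$, which is exactly what is missing on a general space $S$.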

Is there a more general construction of the normal distribution?

Best Answer

Let me assume that you are seeking a generalization of the Gaussian distribution in order to generalize Brownian motion.

As far as I know, regarding the heat kernel as the generalization of the Gaussian distribution has long been standard in the literature. It comes from the following observation.

In $\mathbb {R}^1$, the following notions coincide:

(1) Gaussian distribution $N(x,t)\sim f(t,x,y)=\frac {1}{\sqrt{2\pi t}}e^{\frac{-(y-x)^2}{2t}}$,

(2) transition function $p(t,x,y)$ of the Brownian motion $B_t$,

(3) (heat kernel) fundamental solution $k_t(x,y)$ of the heat equation $\partial_t k=\frac{1}{2}\Delta_y k$, with initial data $\delta_x$.
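
The link between (1) and (3) can be checked by hand. With $f(t,x,y)=\frac{1}{\sqrt{2\pi t}}e^{-\frac{(y-x)^2}{2t}}$ as in (1), direct differentiation gives

$$\partial_t f=\left(\frac{(y-x)^2}{2t^2}-\frac{1}{2t}\right)f,\qquad \partial_y f=-\frac{y-x}{t}\,f,\qquad \partial_y^2 f=\left(\frac{(y-x)^2}{t^2}-\frac{1}{t}\right)f,$$

so $\partial_t f=\frac{1}{2}\partial_y^2 f$, and $f(t,x,\cdot)\to\delta_x$ as $t\to 0$. The factor $\frac{1}{2}$ is tied to the normalisation in (1): under the other common convention $\partial_t k=\Delta_y k$, the kernel is instead $\frac{1}{\sqrt{4\pi t}}e^{-\frac{(y-x)^2}{4t}}$, i.e. the Gaussian with variance $2t$.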

Thus, on manifolds, one way to define Brownian motion is to construct a Markov process on the manifold whose transition function is exactly the heat kernel (let's identify the heat kernel with the Gaussian distribution in this setting). Since we always have the Laplace-Beltrami operator $\Delta$ on a Riemannian manifold, it makes sense to talk about the heat equation and thus the heat kernel, and Brownian motion in this sense is known to exist for a large class of manifolds.

But on metric spaces, we no longer have the Laplace-Beltrami operator. So, in order to talk about a heat kernel/Gaussian distribution, we need to generalize the notion of the Laplace-Beltrami operator. The key concept along this line is the so-called Dirichlet form. A Dirichlet form on a metric measure space $(X,d,\mu)$ is a closed symmetric form $\mathcal{E}(\cdot,\cdot)$ defined on a dense subspace of $L^2(X,\mu)$. It should further satisfy a couple of conditions (such as the Markov property) so that it behaves like its prototype $\mathcal{E}(f,g)=\int_{M} \nabla f\cdot \nabla g \,dx$ on a manifold $M$. Since $\mathcal{E}(f,g)=(-\Delta f,g)_{L^2(M)}$ on manifolds, in the general case one obtains the desired "Laplacian" from the same formula. Therefore, every Dirichlet form corresponds to a "Laplacian" and thus to a Gaussian distribution (and thus a Brownian motion). What's more, a reasonable Dirichlet form always exists provided the space is sufficiently regular.
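
As a toy illustration of the chain "Dirichlet form, then Laplacian, then heat kernel", here is a minimal sketch on a finite weighted graph with counting measure, where everything reduces to linear algebra. The cycle graph, the unit weights and the helper names (`laplacian`, `dirichlet_form`, `heat_kernel`) are assumptions made for this example. The form is $\mathcal{E}(f,g)=\frac{1}{2}\sum_{x,y}w_{xy}(f(x)-f(y))(g(x)-g(y))$, its generator is the graph Laplacian $L=D-W$ (playing the role of $-\Delta$), and the heat kernel is $e^{-tL}$, computed here by a plain Taylor series with scaling and squaring to avoid external dependencies.

```python
def laplacian(w):
    """Graph Laplacian L = D - W for a symmetric weight matrix w (list of lists)."""
    n = len(w)
    return [[(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
            for i in range(n)]

def dirichlet_form(w, f, g):
    """E(f, g) = 1/2 * sum over x, y of w[x][y] * (f[x]-f[y]) * (g[x]-g[y])."""
    n = len(w)
    return 0.5 * sum(w[x][y] * (f[x] - f[y]) * (g[x] - g[y])
                     for x in range(n) for y in range(n))

def matmul(a, b):
    """Product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heat_kernel(lap, t, squarings=10, terms=20):
    """Approximate exp(-t * lap): Taylor series for a small time step,
    followed by repeated squaring (illustrative defaults, not tuned)."""
    n = len(lap)
    step = t / 2 ** squarings
    a = [[-step * lap[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term now holds a^k / k!
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

# Hypothetical example space: a cycle with 6 vertices and unit edge weights.
n = 6
w = [[0.0] * n for _ in range(n)]
for i in range(n):
    w[i][(i + 1) % n] = 1.0
    w[(i + 1) % n][i] = 1.0

L = laplacian(w)
P = heat_kernel(L, 0.7)  # P[x][y] is the heat kernel p_t(x, y) at t = 0.7

if __name__ == "__main__":
    # Row 0 is the "Gaussian distribution" centred at vertex 0.
    print([round(p, 4) for p in P[0]])
```

Each row of `P` is a probability distribution on the vertices, which is the discrete counterpart of the Gaussian centred at that vertex; the continuous-time random walk with these transition probabilities is the corresponding "Brownian motion".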

In sum, if the space you are considering has both metric and measure structures, then the theory of Dirichlet forms may provide you with some satisfactory results regarding the construction and properties of the Gaussian distribution (and thus the Brownian motion). Roughly speaking, if we don't have a given measure, we may not be able to construct a reasonable probability space; if we don't have a metric, it would be hard to measure the regularity and decay of the Gaussian distribution. So a metric measure structure might be the minimal structure for a reasonable construction of the Gaussian distribution.

Some reference books can be found in the above link. This paper by Sturm may give you a glance at the whole picture. I am not an expert in this field, and I apologize in advance for any mistakes or naivety.