Probability – Proper Notation for Derived Probability Measures in Measure Theory

distributions, measure-theory, notation, probability

Suppose that I have a random variable $X$ which is variable-dimensional (i.e., it could be 1-dimensional, 2-dimensional, 3-dimensional, etc.). The dimension of a specific value $x$ is known and denoted $d(x)$. The distribution of $X$ can be discrete, continuous, or mixed, and I use the measure $\mathbb P_X$ to refer to it.

Now, I am interested in the indexed pair $Y=(X, I)$, where $1\leq I \leq d(X)$. In other words, each $Y$ is a joint object containing the full vector $X$ together with an index $I$. Suppose I know that, conditional on $X$, the index $I$ is uniformly distributed over $\{1, \dots, d(X)\}$ (i.e., each index has probability $1/d(X)$). Thus, I can sample $Y$ by the following procedure (sketched in code after the list):

  1. $X\sim \mathbb P_X$
  2. $I\sim \mathrm{Unif}(\{1, …, d(X)\})$
  3. $Y=(X, I)$
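
To make this procedure concrete, here is a minimal runnable sketch of the sampler. The specific distributional choices are illustrative assumptions, not part of the question: the dimension $d(X)$ is drawn uniformly from $\{1,2,3\}$, and the entries of $X$ are i.i.d. standard normal given the dimension; any choice of $\mathbb P_X$ would do.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_Y(rng):
    """Draw one Y = (X, I) by the three-step procedure above.

    Illustrative assumptions (not from the question): d(X) is uniform
    on {1, 2, 3}, and the entries of X are i.i.d. N(0, 1) given d(X).
    """
    d = rng.integers(1, 4)        # dimension of X (illustrative assumption)
    x = rng.standard_normal(d)    # step 1: X ~ P_X
    i = rng.integers(1, d + 1)    # step 2: I ~ Unif({1, ..., d(X)})
    return x, i                   # step 3: Y = (X, I)

x, i = sample_Y(rng)
print(x, i)
```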

What is the formal way of writing down the probability measure for $Y$? Specifically, I want to express $\mathbb P_Y = ?$. The tricky part is that whether $Y$ is continuous or discrete depends on whether $X$ is, and ideally I want a unified notation covering both cases (as well as the mixed case). I am thinking of something like $\mathbb P_Y=\frac{1}{d(X)}\mathbb P_X$, but this doesn't seem correct: it contains an "undefined" variable $X$, and I don't think I can divide a probability measure by something anyway.

Best Answer

In order to stress that some of your objects are vectors, I am going to call them $\mathbf{Y}$ and $\mathbf{X}$ in bold notation.

We usually don't try to write probability measures explicitly, since they are functions defined on a sigma-field, which makes them cumbersome to write down in closed form. When we want explicit results in a situation where the random variables may be discrete or continuous (or mixed), we usually work with the cumulative distribution function instead.


Simplifying the distribution: In your case, the main challenge is dealing with a random vector of variable dimension. A simple and effective way to handle this is to define the random vector using two parts: an infinite random sequence giving all potential elements, and a random variable giving the actual dimension. To do this, suppose we define a random sequence $\mathbf{X}_\infty = (X_1,X_2,X_3,...)$, a random dimension $D$, and a random index $I$, and then define:

$$\begin{align} \mathbf{X} &\equiv (X_1,...,X_D), \\[6pt] \mathbf{Y} &\equiv (X_1,...,X_D,I). \\[6pt] \end{align}$$

The random object $(\mathbf{X}_\infty, D)$ fully determines $\mathbf{X}$ and the random object $(\mathbf{X}_\infty, D, I)$ fully determines $\mathbf{Y}$. We can characterise the distribution of interest by noting the conditional independence $I \ \bot \ \mathbf{X}_\infty | D$ and then decomposing it into the following parts:

$$\begin{align} p(d) &\equiv \mathbb{P}(D=d), \\[10pt] p_d(i) &\equiv \mathbb{P}(I=i |D=d) = \frac{\mathbb{I}(i \in \{1,...,d\})}{d}, \\[12pt] F_d(\mathbf{x}_\infty) &\equiv \mathbb{P}(\mathbf{X}_\infty \leqslant \mathbf{x}_\infty|D=d). \\[6pt] \end{align}$$

This gives us some structure that makes it simpler to write the joint distribution of interest, leading to the probability measure of interest.
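
As a hypothetical illustration of the decomposition, suppose the dimension is either $1$ or $2$ with equal probability. Then the first two parts reduce to

$$p(1) = p(2) = \tfrac{1}{2}, \qquad p_1(1) = 1, \qquad p_2(1) = p_2(2) = \tfrac{1}{2},$$

and $F_1$ and $F_2$ are just the conditional CDFs of the sequence given each dimension.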


Writing the distribution of interest: Now, let $\mathbf{y} = (\mathbf{x}, i) = (x_1,...,x_d, i)$ denote a specific value for the outcome $\mathbf{Y}$. Regardless of whether the values in $\mathbf{X}_\infty$ are discrete or continuous (or mixed), we can encompass all the information of interest using the following function:

$$\begin{align} H_\mathbf{Y}(\mathbf{y}) &\equiv \mathbb{P}(\mathbf{X} \leqslant \mathbf{x}, I = i) \\[6pt] &= \mathbb{P}(\mathbf{X} \leqslant \mathbf{x}, I = i, D = d) \\[6pt] &= \mathbb{P}(\mathbf{X} \leqslant \mathbf{x}| I = i, D = d) \cdot p_d(i) \cdot p(d) \\[6pt] &= F_d((\mathbf{x},\infty,\infty,\infty,...)) \cdot p_d(i) \cdot p(d). \\[6pt] \end{align}$$

This function is a hybrid of a CDF (with respect to $\mathbf{X}$) and a probability mass function (with respect to $I$). It uniquely determines the probability measure $\mathbb{P}_\mathbf{Y}$. Specifically, it allows you to define the pre-measure:

$$\mathbb{P}_\mathbf{Y}((-\infty, \mathbf{x}] \times (-\infty,i]) = \sum_{j=1}^{\lfloor i \rfloor} H_\mathbf{Y}((\mathbf{x},j)),$$

and the full probability measure is then uniquely defined using the Carathéodory extension theorem.
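
If helpful, here is a numerical sanity check of this construction, under the same illustrative assumptions as in the earlier sketch ($D$ uniform on $\{1,2,3\}$ and i.i.d. standard normal entries, so that $F_d$ factorises into standard normal marginals). It compares a Monte Carlo estimate of $H_\mathbf{Y}(\mathbf{y})$ against the closed form $\Phi(x_1)\cdots\Phi(x_d)\, p_d(i)\, p(d)$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Point of evaluation: y = (x, i) with x = (0.5, -0.2), so d = 2, i = 1.
x_eval = np.array([0.5, -0.2])
i_eval, d_eval = 1, len(x_eval)

# Closed form under the illustrative assumptions:
# F_d((x, inf, inf, ...)) = prod_j Phi(x_j), p_d(i) = 1/d, p(d) = 1/3.
analytic = norm.cdf(x_eval).prod() * (1 / d_eval) * (1 / 3)

# Monte Carlo estimate of P(X <= x, I = i); note the event {X <= x}
# with dim(x) = d forces D = d, matching the derivation of H_Y above.
n, hits = 200_000, 0
for _ in range(n):
    d = rng.integers(1, 4)            # D ~ Unif({1, 2, 3})  (assumption)
    x = rng.standard_normal(d)        # X | D = d has i.i.d. N(0, 1) entries
    i = rng.integers(1, d + 1)        # I | D = d ~ Unif({1, ..., d})
    if d == d_eval and i == i_eval and np.all(x <= x_eval):
        hits += 1

print(analytic, hits / n)  # the two values should agree to a few decimals
```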
