[Math] Convolution of two probability distributions

convolution, probability, probability distributions, probability theory

I would like your help to find the correct definition of the convolution of two probability distributions. I found several references on it, but they are very complicated, involving random variables and more. At this stage I would just like an explanation of what the convolution operator consists of.

(similar question here but no answer)

For example, consider the following mixture of 2 CDFs
$$
F(x)=\lambda G(x-\mu_1)+(1-\lambda)G(x-\mu_2)
$$

where $\lambda\in [0,1]$.

Define
$$
\Delta(x; \mu_1, \mu_2; \lambda)\equiv \lambda\times 1\{\mu_1\leq x\}+(1-\lambda)\times 1\{\mu_2\leq x\}
$$

I found a reference claiming
$$
F(\cdot)=G(\cdot)*\Delta(\cdot; \mu_1, \mu_2; \lambda)
$$

where $*$ is the convolution symbol.

Which operation is $*$ performing?

Best Answer

$\require{begingroup}\begingroup\renewcommand{\dd}[1]{\,\mathrm{d}#1}$There's no page 286 in the Project Euclid paper; I think you mean page 226.

tl;dr This is just a case of sloppy language/notation.

The authors use the notion of convolution just as a highbrow way to shift the base CDF $G(x)$ to $G(x - \mu_j)$, and this really has nothing to do with probability (the usual addition of independent random variables).

With $G$ being zero-symmetric as in the paper, let me use a new notation $S_j$ for the Dirac delta function $S_{j}(z)= \delta(z - \mu_j)$. This is a peak of mass $1$ at $\mu_j$, where the argument $z - \mu_j$ vanishes (is zero).

The shift of $G$ is done by the convolution ($S$ stands for shift)

\begin{align}
(G * S_j)(x) &= \int_{t = -\infty}^{\infty}G(t)\, S_j(x - t) \dd{t} & &\text{, the usual definition of convolution} \\
&= \int_{t = -\infty}^{\infty}G(t)\, \delta\bigl(x - t - \mu_j\bigr) \dd{t} &&\text{, just definition of $S_j$} \\
&= \int_{t = -\infty}^{\infty}G(t)\, \delta\Bigl(- \bigl(t - (x - \mu_j) \bigr) \Bigr) \dd{t} &&\text{, find which $t$ makes the whole argument vanish}\\
&= G(x - \mu_j) &&\text{, since $\delta$ is even and picks out $t = x - \mu_j$}
\end{align}
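The shift can also be checked numerically. The following is only a rough sketch under stated assumptions: the logistic CDF stands in for the zero-symmetric $G$ (the paper's $G$ is not specified here), the Dirac delta is approximated by a narrow Gaussian bump, and the function names, grid size and bump width are illustrative choices, not anything taken from the paper.

```python
import numpy as np

def G(x):
    """Logistic CDF, used here only as a stand-in zero-symmetric base CDF."""
    return 1.0 / (1.0 + np.exp(-x))

def narrow_bump(z, width):
    """Gaussian density with standard deviation `width`; approximates delta(z)."""
    return np.exp(-0.5 * (z / width) ** 2) / (width * np.sqrt(2.0 * np.pi))

def convolve_with_shifted_delta(x, mu, width=0.01, half_range=20.0, n=400_001):
    """Riemann-sum approximation of the integral of G(t) * delta(x - t - mu) dt."""
    t = np.linspace(-half_range, half_range, n)
    dt = t[1] - t[0]
    return float(np.sum(G(t) * narrow_bump(x - t - mu, width)) * dt)

mu = 1.5  # hypothetical shift, playing the role of mu_j
for x in (-2.0, 0.0, 0.7, 3.0):
    approx = convolve_with_shifted_delta(x, mu)
    exact = float(G(x - mu))
    print(f"x={x:5.2f}  convolution={approx:.6f}  G(x - mu)={exact:.6f}")
```

The two printed columns should agree up to the small smoothing error introduced by replacing the delta with a bump of finite width.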

As shown here, what the authors of the paper are talking about is the convolution of $G$ with a Dirac delta (a peak), NOT the convolution of $G$ with the "distribution function (CDF) $\Delta_k$" that is a (convex combo of) step functions.
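To see why the literal reading fails, here is a short sketch of what the ordinary convolution integral gives when $G$ is paired with a single step function $1\{\mu_j \leq z\}$ instead of a delta (assuming the integral converges at all, which requires $G$ to decay fast enough at $-\infty$):

\begin{align}
\int_{t = -\infty}^{\infty} G(t)\, 1\{\mu_j \leq x - t\} \,\mathrm{d}t &= \int_{t = -\infty}^{x - \mu_j} G(t) \,\mathrm{d}t
\end{align}

This is an antiderivative of $G(x - \mu_j)$, not $G(x - \mu_j)$ itself. Since $G(t) \to 1$ as $t \to \infty$, it grows without bound in $x$, so it cannot be a CDF.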

Basically they got sloppy in the language and started doing things "verbally". They also made the unfortunate choice of notation with $\delta_{\mu_j}$ to represent the step functions (my $S_j$ corresponds to their $\delta_{\mu_j}$), which in itself is okay but totally confusing when coupled with the mathematically erroneous expression $G * \Delta_k$.

In fact, throughout the entire paper there's no place where they actually carry out a convolution calculation explicitly. The properties of convolution they use are independent of what $\Delta_k$ actually is, so the error stays implicit.

I bet if you ask them "how does convolving $G$ with another CDF ($\Delta_k$) result in yet another legit CDF", they'll respond: "oh, of course it's understood as convolution with the density for that distribution function $\Delta_k$. This is trivial and you should know that."
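Spelled out in the question's notation $\Delta(\cdot; \mu_1, \mu_2; \lambda)$ (assuming this is the two-component case of the paper's $\Delta_k$), that reading goes roughly as follows. The "density" of the step-function CDF is a weighted sum of Dirac deltas, written $\Delta'$ here as an ad hoc symbol, and convolving $G$ with it reproduces the mixture:

\begin{align}
\Delta'(t) &= \lambda\, \delta(t - \mu_1) + (1 - \lambda)\, \delta(t - \mu_2) \\
(G * \Delta')(x) &= \int_{t = -\infty}^{\infty} G(x - t)\, \Delta'(t) \,\mathrm{d}t \\
&= \lambda\, G(x - \mu_1) + (1 - \lambda)\, G(x - \mu_2) = F(x)
\end{align}

Equivalently, this is the Stieltjes integral $\int G(x - t) \,\mathrm{d}\Delta(t)$, where the integrator $\Delta$ jumps by $\lambda$ at $\mu_1$ and by $1 - \lambda$ at $\mu_2$.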

$\endgroup$
