I am trying to find the Kullback–Leibler divergence between a Bernoulli distribution on the two points $T, -T$ with parameter $p$ and a Gaussian distribution with mean $\mu$ and variance $\sigma^2$. My attempt is as follows:
Let
$$
b(x) = q\,\delta(x-T) + p\,\delta(x+T), \qquad q = 1-p,
$$
so that $b$ represents the Bernoulli$(p)$ law on $\{T, -T\}$, and let $g(x)$ be the density of $N(\mu, \sigma^2)$. Then
$$
\begin{align}
D(b||g) &= \int_{-\infty}^{\infty}b(x)\log \left( \frac{b(x)}{g(x)}\right) dx \\
&=\int_{-\infty}^{\infty}b(x)\log \left( b(x) \right) dx - \int_{-\infty}^{\infty}b(x)\log \left( g(x) \right) dx \\
&=A-B
\end{align}
$$
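If the delta representation is taken at face value, then $B$ at least is formally well defined: by the sifting property $\int_{-\infty}^{\infty}\delta(x-a)f(x)\,dx = f(a)$,
$$
B = q\log g(T) + p\log g(-T) = -\log\left(\sqrt{2\pi}\,\sigma\right) - \frac{q(T-\mu)^2 + p(T+\mu)^2}{2\sigma^2}.
$$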
My questions are as follows:
- Can I use this continuous representation of the Bernoulli RV with the help of $\delta(\cdot)$ functions, where $\delta(\cdot)$ is the Dirac delta function?
- Does $A$ exist? On the set $\mathbb{R}\setminus\{\pm T\}$, $\log(\delta(x \mp T)) = -\infty$, while at $x = \pm T$ the delta itself is infinite, so the integrand in $A$ looks ill defined.
- If we cannot compute the KLD between a continuous and a discrete random variable, what is the KLD analogue for this case? My thought was that $B$ alone could serve as a distance: for example, if we want to measure the distance of $b(x)$ from two different Gaussian densities $g_1(x)$ and $g_2(x)$, only $B$ depends on $g_1$ or $g_2$, so only $B$ can discriminate between them (see the numerical sketch after this list).
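To make the last point concrete, here is a minimal numerical sketch (Python with SciPy; the parameter values are made up for illustration) of the term $B = q\log g(T) + p\log g(-T)$, the only part of $D(b\|g)$ that depends on the Gaussian:

```python
from scipy.stats import norm

def cross_term_B(p, T, mu, sigma):
    """B = E_b[log g(X)] = q*log g(T) + p*log g(-T), via the sifting property."""
    q = 1.0 - p
    return q * norm.logpdf(T, loc=mu, scale=sigma) + p * norm.logpdf(-T, loc=mu, scale=sigma)

# Made-up parameters: b puts mass q = 0.7 at T = 1 and mass p = 0.3 at -T = -1.
p, T = 0.3, 1.0
B1 = cross_term_B(p, T, mu=0.0, sigma=1.0)  # against g_1 = N(0, 1)
B2 = cross_term_B(p, T, mu=3.0, sigma=1.0)  # against g_2 = N(3, 1)
print(B1, B2)  # here B1 > B2, so g_1 is "closer" to b in this sense
```

Note that $-B$ is exactly the cross-entropy $H(b, g)$, which differs from $D(b\|g)$ only by the $g$-independent term $A$.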
Best Answer