[Math] What does double vertical bars notation mean in probability

notation, probability, probability theory

I'm studying generative adversarial networks and I've come across the following formula for the inception score:
$$\text{IS}(G) \approx \exp \left(\frac{1}{N}\sum_{i=1}^N D_{KL} \left(p \left(y\vert x^{(i)}\right) \,\Vert\, \hat{p} \left(y \right) \right) \right).$$
I just want to know what those $\Vert$ (double vertical bars) mean in that formula.
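
To make the formula concrete, here is a minimal sketch of how the quantity could be computed. It rests on assumptions that the question does not state: $D_{KL}$ is taken to be the Kullback-Leibler divergence between two discrete distributions over class labels, and $\hat{p}(y)$ is taken to be the average of the $p(y\vert x^{(i)})$ over the $N$ samples. The function names are hypothetical and chosen only for illustration.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions given as equal-length lists.

    Terms with p_k == 0 are skipped, following the convention 0 * log(0) = 0.
    """
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)

def inception_score(cond_probs):
    """cond_probs[i][k] is assumed to hold p(y = k | x^(i)) for sample i."""
    n = len(cond_probs)
    num_classes = len(cond_probs[0])
    # Assumption: p_hat(y) is estimated as the mean of the conditional rows.
    p_hat = [sum(row[k] for row in cond_probs) / n for k in range(num_classes)]
    # Average the per-sample divergences, then exponentiate.
    mean_kl = sum(kl_divergence(row, p_hat) for row in cond_probs) / n
    return math.exp(mean_kl)
```

Under these assumptions, two samples with one-hot predictions on two different classes give a score of approximately 2, while samples whose predictions are all identical give a score of 1.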

Best Answer

I do not know the details of adversarial networks; however, I can offer a general answer from probability theory, which might be close to the answer.

In a measure-theoretic setting, $P(A||\mathscr{G})$ is sometimes written to denote the conditional probability of the event $A$ with respect to the $\sigma$-field $\mathscr{G}$. Here $P$ is a probability measure on the measurable space $(\Omega,\mathscr{F})$, and $\mathscr{F}$ is a larger $\sigma$-field satisfying $\mathscr{G}\subseteq\mathscr{F}$. Random variables $Y$ and $X$ can generate such $\sigma$-fields: say $X$ generates $\mathscr{G}$ and $Y$ generates $\mathscr{F}$; then $P(A||\mathscr{G})=P(Y\in A||X)$. The specific relationship satisfied is

$$\int_{G}P(Y\in A||X)dP=P(\{Y\in A\}\cap\{X\in G\}) \hspace{10pt}\text{for all}\hspace{10pt} G\in\mathscr{G}\hspace{10pt}(1)$$

The $\hat{p}(y)$ in your equation probably (I am guessing here) denotes an estimate using a sample of random data $Y$ observed at $Y=y$. This estimate $\hat{p}(y)$ will be a random variable so perhaps all of the above will apply and the $||$ notation simply hints at the measure-theoretic machinery I allude to.

In the special case where $P(Y\in A||X)$ is constant on $G$, so that

$$\begin{aligned}\int_{G}P(Y\in A||X)\,dP&=P(Y\in A||X)\int_{G}dP\\&=P(Y\in A||X)\,P(X\in G),\end{aligned}$$

then equation (1) reduces to

$$\begin{aligned}P(Y\in A||X)&=P(\{Y\in A\}\cap\{X\in G\})/P(X\in G)\\&=P([\{Y\in A\}\cap\{X\in G\}]|X\in G)\end{aligned}$$

using the traditional $|$ notation signifying the $P(A,B)/P(B)=P(A|B)$ definition. In general the two definitions do not coincide. I believe a sufficient condition is that $\mathscr{G}$ is generated by a countable partition $\mathcal{A}$ of $\Omega$ into events of positive probability, that is $\mathscr{G}=\sigma(\mathcal{A})$ where $|\mathcal{A}|\le\aleph_{0}$.
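
As a concrete instance where the two notions agree, here is a short worked case. It assumes, purely for illustration and beyond the general setup above, that $X$ takes countably many values $x_1,x_2,\dots$, each with $P(X=x_k)>0$. A version of the conditional probability is then

$$P(Y\in A||X)(\omega)=\sum_{k}P(Y\in A\,|\,X=x_k)\,\mathbf{1}_{\{X=x_k\}}(\omega),$$

and it can be checked against the defining relation on each set $\{X=x_j\}$:

$$\int_{\{X=x_j\}}P(Y\in A||X)\,dP=P(Y\in A\,|\,X=x_j)\,P(X=x_j)=P(\{Y\in A\}\cap\{X=x_j\}).$$

Since every set in the $\sigma$-field generated by $X$ is a countable disjoint union of such sets, the relation extends to all of them by countable additivity.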
