I am reading the Wikipedia article on conditional independence. There seem to be two definitions of conditional independence for two random variables $X$ and $Y$ given a third one $Z$:
- Two random variables $X$ and $Y$ are conditionally independent given a third random variable $Z$ if and only if they are independent in their conditional probability distribution given $Z$. That is, $X$ and $Y$ are conditionally independent given $Z$ if and only if, given any value of $Z$, the probability distribution of $X$ is the same for all values of $Y$ and the probability distribution of $Y$ is the same for all values of $X$.
- Two random variables $X$ and $Y$ are conditionally independent given a random variable $Z$ if they are independent given $\sigma(Z)$, the $\sigma$-algebra generated by $Z$. Two events $R$ and $B$ are conditionally independent given a $\sigma$-algebra $\Sigma$ if $$\Pr(R \cap B \mid \Sigma) = \Pr(R \mid \Sigma)\Pr(B \mid \Sigma) \quad \text{a.s.},$$ where $\Pr(A \mid \Sigma)$ denotes the conditional expectation of the indicator function of the event $A$ given the $\sigma$-algebra $\Sigma$, that is, $$\Pr(A \mid \Sigma) := \operatorname{E}[\chi_A \mid \Sigma].$$ Two random variables $X$ and $Y$ are conditionally independent given a $\sigma$-algebra $\Sigma$ if the above equation holds for all $R \in \sigma(X)$ and $B \in \sigma(Y)$.
I can understand the second definition, but my questions are:
- What does the first definition actually mean? I have tried reading it several times but fail to grasp it. Could someone rephrase it in rigorous, clean language, for example by writing the definition in terms of formulae?
- Do the two definitions agree with each other? Why?
- ADDED: I was wondering whether the following is the correct way to understand the first definition. Notice that $P(A \mid Z)$ is defined as $E(\chi_A \mid Z)$ and is therefore a random variable. When the conditional probability $P(\cdot \mid Z)$ is "regular", i.e. when $P(\cdot \mid Z)(\omega)$ is a probability measure for each point $\omega$ of the underlying sample space $(\Omega, \mathcal{F}, P)$, does conditional independence of $X$ and $Y$ given $Z$ mean that $X$ and $Y$ are independent w.r.t. every probability measure $P(\cdot \mid Z)(\omega)$, $\omega \in \Omega$? If yes, is the conditional probability $P(\cdot \mid Z)$ always guaranteed to be regular, so that there is no need to state this assumption explicitly?
Thanks and regards!
Best Answer
The first definition is the informal one, yet it still reads rather convoluted to me.
I'd prefer: $X$ and $Y$ are conditionally independent given $Z$ iff
$$P(X, Y \mid Z) = P(X \mid Z)\,P(Y \mid Z).$$
Recall that conditioning one (or several) variables on the value of another is (informally) the same as restricting the whole universe to a part of it. So, if you are given the value of $Z$, you can think of defining new variables that are the same as the unconditioned ones but live in this new, smaller universe: $$X' \equiv X \mid Z, \qquad Y' \equiv Y \mid Z.$$ The formula above simply states that $X'$ and $Y'$ are independent.
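To see this concretely, here is a minimal numeric sketch (the setup, with $Z$ choosing between two coin biases, is my own illustration, not part of the question): given $Z$, $X$ and $Y$ are independent flips of a coin whose bias depends on $Z$, so they are conditionally independent given $Z$ yet dependent once $Z$ is marginalized out.

```python
from itertools import product

p_z = {0: 0.5, 1: 0.5}    # distribution of Z
bias = {0: 0.2, 1: 0.9}   # P(X=1 | Z=z) = P(Y=1 | Z=z) = bias[z]

def bern(p, x):
    # probability that a coin with bias p comes up x (x is 0 or 1)
    return p if x == 1 else 1 - p

# joint distribution P(X=x, Y=y, Z=z), built so that X and Y are
# independent flips of the coin selected by Z
joint = {(x, y, z): p_z[z] * bern(bias[z], x) * bern(bias[z], y)
         for x, y, z in product((0, 1), repeat=3)}

def marg(**fixed):
    # marginal probability of the coordinates fixed by keyword, e.g. marg(z=0)
    return sum(p for (x, y, z), p in joint.items()
               if all({'x': x, 'y': y, 'z': z}[k] == v for k, v in fixed.items()))

# conditional independence: P(X=x, Y=y | Z=z) == P(X=x | Z=z) * P(Y=y | Z=z)
for x, y, z in product((0, 1), repeat=3):
    lhs = marg(x=x, y=y, z=z) / marg(z=z)
    rhs = (marg(x=x, z=z) / marg(z=z)) * (marg(y=y, z=z) / marg(z=z))
    assert abs(lhs - rhs) < 1e-12

# ...but X and Y are dependent once Z is marginalized out:
print(marg(x=1, y=1), marg(x=1) * marg(y=1))   # 0.425 vs 0.3025
```

The last line shows $P(X=1, Y=1) = 0.425 \neq 0.3025 = P(X=1)\,P(Y=1)$: conditional independence given $Z$ does not imply unconditional independence.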
The first definition says the same thing, but it applies (in words) the property that two variables are independent iff conditioning one on the other leaves its distribution unchanged: $A$ indep $B$ iff $P(A \mid B) = P(A)$. Carried out inside the universe conditioned on $Z$, this reads $P(X \mid Y, Z) = P(X \mid Z)$ and $P(Y \mid X, Z) = P(Y \mid Z)$, which is exactly the "probability distribution of $X$ is the same for all values of $Y$" wording quoted above.
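To see why the two phrasings agree, here is a short derivation for discrete variables (a sketch assuming all conditioning events have positive probability):
$$P(X=x \mid Y=y, Z=z) = \frac{P(X=x, Y=y \mid Z=z)}{P(Y=y \mid Z=z)} = \frac{P(X=x \mid Z=z)\,P(Y=y \mid Z=z)}{P(Y=y \mid Z=z)} = P(X=x \mid Z=z).$$
So given the value of $Z$, the distribution of $X$ does not depend on the value of $Y$; running the computation backwards recovers the product formula, and by symmetry the same holds with the roles of $X$ and $Y$ swapped.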