Suppose a sequence of n random variables $X_1 \dots X_n$ is generated from a source where the first random variable $X_1$ is determined by a fair coin flip, and subsequent random variables are the complement of the previous if a fair die roll is greater than 4. I.e.
$X_{i} = \bar{X}_{i-1}$ with probability 1/3
$X_{i} = X_{i-1}$ with probability 2/3
such a sequence forms a Markov process with state transition matrix:
$P=\begin{bmatrix} 2/3 & 1/3 \\ 1/3 &2/3 \end{bmatrix}$
which has a stationary distribution of
$\mu = [P(X_i = 0)\quad P(X_i = 1)] = [1/2 \quad 1/2 ]$
Now suppose a second sequence of random variables $Y_1 \dots Y_n$ s.t.:
$Y_1 = X_1$ and $Y_i = XOR(X_i, Y_{i-1})$
I am trying to calculate the joint entropy $H(Y_1,Y_2 \dots Y_n)$
I figure I should use the chain rule to express the joint entropy as:
$H(Y_1,Y_2 \dots Y_n) = H(Y_1) + H(Y_2|Y_1) + \dots H(Y_n|Y_{n-1},\dots Y_2,Y_1)$
The first two terms of this should be fairly straightforward to calculate from the definition of conditional entropy. It is calculating the latter two terms that is giving me trouble. I believe that since I can determine $X_i$ from $Y_{i-1}$ and $Y_{i-2}$ and that I can determine the probability of the next $Y_i$ given the previous $Y_{i-1}$ and $X_{i-1}$ it follows that:
$H(Y_i|Y_{i-1}, Y_{i-2},\dots Y_1) = H(Y_i|Y_{i-1}, Y_{i-2}) = H(Y_i|Y_{i-1}, X_{i-1}) $
The issue I am having is in calculation of $ H(Y_i|Y_{i-1}, X_{i-1}) $ which can be expressed as:
$H(Y_i|Y_{i-1}, X_{i-1}) = -\sum_{y_i,y_{i-1},x_{i-1} }p(y_i, y_{i-1}, x_{i-1}) log(p(y_i| y_{i-1}, x_{i-1}))$
I can not figure out how to compute $p(y_i, y_{i-1}, x_{i-1})$
and was hoping somebody could point me in the right direction, since I've spun my wheels on this for a couple hours so far.
Best Answer
You have computed $H(X_1,\ldots,X_n)$ using the markov chain. Note that for each realisation of $(X_1,\ldots,X_n)$ there is a unique realisation of $(Y_1,\ldots,Y_n)$. In other words, $Y_i=f(X_i,Y_{i-1})$ is a deterministic function so entropy is preserved and $$H(Y_1,\ldots,Y_n)=H(X_1,\ldots,X_n),$$ by induction.