Joint Entropy of a sequence of random variables resulting from XORing with a Markov process at stationary distribution

conditional probability, information theory, probability distributions

Suppose a sequence of $n$ random variables $X_1, \dots, X_n$ is generated from a source where the first random variable $X_1$ is determined by a fair coin flip, and each subsequent random variable is the complement of the previous one if a fair die roll is greater than 4 (probability $2/6 = 1/3$). That is,

$X_{i} = \bar{X}_{i-1}$ with probability 1/3

$X_{i} = X_{i-1}$ with probability 2/3

Such a sequence forms a Markov process with state transition matrix:

$P=\begin{bmatrix} 2/3 & 1/3 \\ 1/3 &2/3 \end{bmatrix}$

which has a stationary distribution of

$\mu = [P(X_i = 0)\quad P(X_i = 1)] = [1/2 \quad 1/2 ]$
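As a quick numerical sanity check that this distribution really is stationary (i.e. that $\mu P = \mu$), here is a minimal sketch in Python, assuming NumPy is available:

```python
import numpy as np

# Transition matrix of the X-chain: keep the previous bit w.p. 2/3, flip it w.p. 1/3
P = np.array([[2/3, 1/3],
              [1/3, 2/3]])
mu = np.array([1/2, 1/2])

print(mu @ P)  # [0.5 0.5] -- unchanged by one step, so mu is stationary
```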

Now suppose a second sequence of random variables $Y_1 \dots Y_n$ is defined such that:

$Y_1 = X_1$ and $Y_i = X_i \oplus Y_{i-1}$ for $i \ge 2$ (where $\oplus$ denotes XOR)
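For concreteness, this is how I picture the two sequences being generated (a minimal Python sketch; the helper name sample_sequences is chosen just for illustration):

```python
import random

def sample_sequences(n):
    """Draw X_1..X_n from the Markov chain, then form Y_1 = X_1 and Y_i = XOR(X_i, Y_{i-1})."""
    xs = [random.randint(0, 1)]              # X_1: fair coin flip
    for _ in range(1, n):
        flip = random.random() < 1/3         # complement the previous bit with probability 1/3
        xs.append(xs[-1] ^ (1 if flip else 0))
    ys = [xs[0]]                             # Y_1 = X_1
    for i in range(1, n):
        ys.append(xs[i] ^ ys[-1])            # Y_i = XOR(X_i, Y_{i-1})
    return xs, ys

print(sample_sequences(10))
```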

I am trying to calculate the joint entropy $H(Y_1,Y_2 \dots Y_n)$

I figure I should use the chain rule to express the joint entropy as:

$H(Y_1,Y_2, \dots, Y_n) = H(Y_1) + H(Y_2|Y_1) + \dots + H(Y_n|Y_{n-1},\dots, Y_2,Y_1)$

The first two terms should be fairly straightforward to calculate from the definition of conditional entropy; it is the later terms that are giving me trouble. I believe that, since $X_{i-1}$ can be determined from $Y_{i-1}$ and $Y_{i-2}$ (as $X_{i-1} = Y_{i-1} \oplus Y_{i-2}$), and since the probability of the next $Y_i$ is determined by $Y_{i-1}$ and $X_{i-1}$, it follows that:

$H(Y_i|Y_{i-1}, Y_{i-2},\dots Y_1) = H(Y_i|Y_{i-1}, Y_{i-2}) = H(Y_i|Y_{i-1}, X_{i-1}) $

The issue I am having is in calculating $H(Y_i|Y_{i-1}, X_{i-1})$, which can be expressed as:

$H(Y_i|Y_{i-1}, X_{i-1}) = -\sum_{y_i,\,y_{i-1},\,x_{i-1}} p(y_i, y_{i-1}, x_{i-1}) \log p(y_i \mid y_{i-1}, x_{i-1})$

I cannot figure out how to compute $p(y_i, y_{i-1}, x_{i-1})$, and was hoping somebody could point me in the right direction, since I've spun my wheels on this for a couple of hours so far.
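For reference, the joint pmf $p(y_i, y_{i-1}, x_{i-1})$ can at least be tabulated by brute force, by enumerating every $x$-sequence and weighting it by its Markov probability. A minimal Python sketch (the helper name joint_pmf, the choice $i = 5$, and base-2 logarithms are mine) that I've been using to sanity-check candidate closed forms:

```python
from collections import defaultdict
from itertools import product
import math

def joint_pmf(i):
    """Tabulate p(y_i, y_{i-1}, x_{i-1}) by enumerating all x_1..x_i."""
    pmf = defaultdict(float)
    for xs in product((0, 1), repeat=i):
        p = 0.5                              # P(X_1 = x_1) = 1/2
        for a, b in zip(xs, xs[1:]):
            p *= 2/3 if a == b else 1/3      # stay w.p. 2/3, flip w.p. 1/3
        ys, acc = [], 0
        for x in xs:                         # Y_k = X_1 xor ... xor X_k
            acc ^= x
            ys.append(acc)
        pmf[(ys[-1], ys[-2], xs[-2])] += p
    return pmf

pmf = joint_pmf(5)

# marginal p(y_{i-1}, x_{i-1}) and the conditional entropy H(Y_i | Y_{i-1}, X_{i-1}) in bits
marg = defaultdict(float)
for (y, yp, xp), p in pmf.items():
    marg[(yp, xp)] += p
H = -sum(p * math.log2(p / marg[(yp, xp)]) for (y, yp, xp), p in pmf.items() if p > 0)
print(H)  # approximately 0.918, i.e. -(1/3)log2(1/3) - (2/3)log2(2/3)
```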

Best Answer

You have computed $H(X_1,\ldots,X_n)$ using the Markov chain. Note that the map from $(X_1,\ldots,X_n)$ to $(Y_1,\ldots,Y_n)$ is a bijection: each $Y_i = X_i \oplus Y_{i-1}$ is a deterministic function of the $X$'s, and conversely each $X_i = Y_i \oplus Y_{i-1}$ (with $Y_0 := 0$) is recovered from the $Y$'s, by induction on $i$. An invertible deterministic map preserves entropy, so $$H(Y_1,\ldots,Y_n)=H(X_1,\ldots,X_n).$$
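Concretely, writing $H_b$ for the binary entropy function, the chain rule together with the Markov property and the stationary start give

$$H(X_1,\ldots,X_n) = H(X_1) + \sum_{i=2}^{n} H(X_i \mid X_{i-1}) = 1 + (n-1)\,H_b\!\left(\tfrac{1}{3}\right),$$

with $H_b(1/3) = -\tfrac{1}{3}\log_2\tfrac{1}{3} - \tfrac{2}{3}\log_2\tfrac{2}{3} \approx 0.918$ bits, and by the identity above this is also $H(Y_1,\ldots,Y_n)$.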
