Data Processing Inequality for the sufficient statistic case

information-theory, probability, statistics, sufficient-statistics

Consider a Markov chain $ X \rightarrow Y \rightarrow Z $ and assume $Z$ is a sufficient statistic for $X$ (i.e., $I(X;Y)=I(X;Z)$). Is there a case for $X$, $Y$, and $Z$ where $H(Y) > H(Z)$?

Here is my attempt:
Since $Z$ is a sufficient statistic for $X$, we have
$I(X;Y)=I(X;Z)$, which implies $H(X|Y)=H(X|Z)$. Suppose $X$ is a Bernoulli random variable with parameter $\alpha$, and based on $X$ we generate $Y$ as a Bernoulli random variable with parameter $\beta$ or $\gamma$ (like flipping one of two different coins depending on the observation of $X$), and similarly $Z$ with a different pair of parameters. Then we can choose parameters that satisfy $H(Y) > H(Z)$ (or $H(Z) > H(Y)$) while keeping $H(X|Y)=H(X|Z)$. However, this requires quite a lot of computation; is there a shorter solution, or perhaps a better intuition for this?
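
As a side note, the parameter search described above can be done numerically rather than by hand. Below is a minimal sketch (Python, with illustrative parameter values I chose for the example): it fixes an asymmetric pair of coins for $Z$ and scans a symmetric pair for $Y$ until $H(X|Y) \approx H(X|Z)$, at which point $H(Y) = 1 > H(Z)$. It only compares these entropy quantities; it does not by itself verify the Markov chain $X \rightarrow Y \rightarrow Z$.

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def channel_stats(alpha, p0, p1):
    """X ~ Bernoulli(alpha); V | X=x ~ Bernoulli(p_x).
    Returns (H(V), H(X|V)) in bits."""
    pv1 = (1 - alpha) * p0 + alpha * p1          # P(V = 1) by total probability
    H_V = h(pv1)
    H_V_given_X = (1 - alpha) * h(p0) + alpha * h(p1)
    H_X_given_V = h(alpha) + H_V_given_X - H_V   # H(X|V) = H(X) + H(V|X) - H(V)
    return H_V, H_X_given_V

alpha = 0.5
# Z: an asymmetric pair of coins (illustrative parameters)
H_Z, H_XgZ = channel_stats(alpha, 0.05, 0.5)

# Y: a symmetric pair of coins; scan its parameter until H(X|Y) matches H(X|Z)
for i in range(1, 5000):
    q = i / 10000
    H_Y, H_XgY = channel_stats(alpha, q, 1 - q)
    if abs(H_XgY - H_XgZ) < 1e-3:
        print(f"H(X|Y)={H_XgY:.4f}  H(X|Z)={H_XgZ:.4f}  "
              f"H(Y)={H_Y:.4f} > H(Z)={H_Z:.4f}")
        break
```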

Best Answer

Basically, if you put some unrelated information into $Y$, then you'll have your inequality. For instance, suppose you get $Z$ by pushing $X$ through a channel. Let $W$ be independent of $(X,Z)$, and take $Y = (Z,W)$.

$X - Y - Z$ holds because $$ P(Z = z | Y = (z',w), X = x) = \delta_{z,z'} = P(Z = z| Y = (z',w)).$$ Also, by the chain rule, $$ I(Y;X) = I(Z;X) + I(W;X|Z) = I(Z;X), $$ since $I(W;X|Z) = H(W|Z) - H(W|X,Z) = 0$, because $H(W|X,Z) = H(W|Z) = H(W)$ by independence.

But now $H(Y) = H(Z) + H(W) > H(Z)$, provided we pick $W$ with $H(W) > 0$.
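
For concreteness, here is a small numerical sketch of this construction (with illustrative choices: $X$ a fair coin, $Z$ obtained from $X$ through a binary symmetric channel with crossover probability $0.1$, and $W$ an independent fair coin, so $Y=(Z,W)$). It confirms $I(X;Y)=I(X;Z)$ while $H(Y)>H(Z)$.

```python
import itertools
from math import log2

p_x = {0: 0.5, 1: 0.5}   # distribution of X
eps = 0.1                 # BSC crossover probability for the channel X -> Z
p_w = {0: 0.5, 1: 0.5}   # distribution of W, independent of (X, Z)

# Joint distribution p(x, z, w)
joint = {}
for x, z, w in itertools.product([0, 1], repeat=3):
    p_z_given_x = 1 - eps if z == x else eps
    joint[(x, z, w)] = p_x[x] * p_z_given_x * p_w[w]

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, keep):
    """Marginalize the joint onto the coordinates listed in `keep`."""
    out = {}
    for key, p in joint.items():
        k = tuple(key[i] for i in keep)
        out[k] = out.get(k, 0.0) + p
    return out

p_xz = marginal(joint, (0, 1))        # joint of (X, Z)
p_xy = marginal(joint, (0, 1, 2))     # joint of (X, Y) with Y = (Z, W)
p_z  = marginal(joint, (1,))
p_y  = marginal(joint, (1, 2))
p_xm = marginal(joint, (0,))

I_XZ = entropy(p_xm) + entropy(p_z) - entropy(p_xz)
I_XY = entropy(p_xm) + entropy(p_y) - entropy(p_xy)

print(f"I(X;Z) = {I_XZ:.4f}, I(X;Y) = {I_XY:.4f}")               # equal
print(f"H(Z) = {entropy(p_z):.4f}, H(Y) = {entropy(p_y):.4f}")    # H(Y) > H(Z)
```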
