Are $\mathbb{F}_2$-linear combinations of random variables in an i.i.d. Bernoulli process again an i.i.d. Bernoulli process?

bernoulli-distribution, binomial-distribution, mathematical-statistics

I'm having trouble understanding how certain combinations of random variables can be correlated. The problem is as follows:

I have a binary $m \times n$ matrix $A$ of full rank (over the finite field $\mathbb{F}_2$ with two elements), where each row has exactly $w$ $1$'s (i.e., their weights are all $w$).

I randomly pick an $n$-dimensional binary vector $\boldsymbol{v} = (v_0,\dots,v_{n-1}) \in \mathbb{F}_2^n$, where each $v_i$ is $1$ with probability $p$ and $0$ with probability $1-p$. The probability $p$ is assumed to be strictly smaller than $\frac{1}{2}$. Here, $v_i$ are all independent, so that picking $\boldsymbol{v}$ is seen as an i.i.d. Bernoulli process of probability $p$.

I am going to hash this vector $\boldsymbol{v}$ into an $m$-dimensional vector $\boldsymbol{v}' = (v'_0,\dots,v'_{m-1})$ by taking $\boldsymbol{v}' = A\boldsymbol{v}^T$ over $\mathbb{F}_2$. So, if I write the $i$th row of $A$ as $\boldsymbol{r}_i$, each $v'_i$ is the product $v'_i = \boldsymbol{r}_i\cdot\boldsymbol{v}^T$ between the $i$th row of $A$ and $\boldsymbol{v}$. In other words, I'm just taking the mod $2$ sum of some bits of $\boldsymbol{v}$.
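To make the setup concrete, here is a minimal Python sketch of the hashing step. The sizes $n$, $m$, $w$, the probability $p$, and the matrix are made-up examples; linear independence of the rows is not enforced here:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, w, p = 12, 5, 3, 0.2  # made-up example parameters

# Build an example m x n binary matrix whose rows each have weight w.
# (Linear independence of the rows over F_2 is not enforced here.)
A = np.zeros((m, n), dtype=int)
for i in range(m):
    A[i, rng.choice(n, size=w, replace=False)] = 1

# Draw v as an i.i.d. Bernoulli(p) vector.
v = (rng.random(n) < p).astype(int)

# The hash v' = A v^T is the ordinary matrix product reduced mod 2,
# i.e. each v'_i is the XOR of the bits of v selected by row i.
v_prime = A @ v % 2
print(v_prime)
```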

Because the weight of $\boldsymbol{r}_i$ is assumed to be $w$, the probability $P(v'_i = 1)$ that the $i$th bit $v_i'$ of the hashed vector $\boldsymbol{v}'$ is $1$ is

$$P(v'_i = 1) = \sum_{\substack{x=1 \\ x \text{ odd}}}^{w}\binom{w}{x}p^x(1-p)^{w-x},$$

which is a constant once $p$ and $w$ are fixed, and is the same for every $v_i'$.
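As a sanity check, this sum can be evaluated numerically; it agrees with the closed form $\frac{1}{2}\left(1-(1-2p)^w\right)$, a standard identity for the parity of independent bits (stated here as background, not taken from the question). A minimal sketch with made-up values of $p$ and $w$:

```python
from math import comb

p, w = 0.2, 5  # made-up example values

# Direct evaluation of the sum over odd x.
P_odd = sum(comb(w, x) * p**x * (1 - p)**(w - x) for x in range(1, w + 1, 2))

# Closed form for the parity of w independent Bernoulli(p) bits.
closed = (1 - (1 - 2 * p)**w) / 2

print(P_odd, closed)  # both ~0.46112
```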

Now, if $A$ had linearly dependent rows, the hashed bits $v_i'$ would clearly not be independent. My question is:

What if all rows of $A$ are linearly independent? Is $\boldsymbol{v}'$ an i.i.d. Bernoulli process with probability $P(v'_i = 1)$? **If not, can I approximate it as one if $\boldsymbol{r}_i$ and $\boldsymbol{r}_j$ have only a few $1$'s in the same columns for all $i \neq j$?**

I remember reading in a research paper in electrical engineering that $\boldsymbol{v}'$ is i.i.d. Bernoulli if the rows of $A$ are linearly independent, although I can't seem to remember where it was. (I found one recent paper that says this is the case: V. Toto-Zarasoa, A. Roumy, C. Guillemot, "Maximum likelihood BSC parameter estimation for the Slepian-Wolf problem," IEEE Comm. Lett. 15 (2011) 232–234; it's Lemma 1 on the second page.) But now that I think about it, this is counterintuitive (to me), because if two rows of $A$ have a $1$ in the same column, then I used at least one common $v_i$ when hashing $\boldsymbol{v}$ for those two rows, so the corresponding bits of $\boldsymbol{v}'$ look correlated.

If my reasoning and Alecos Papadopoulos's answer are correct, the part of the above question in boldface becomes essential. For instance, are there standard methods for evaluating how close a given set of random variables like $\boldsymbol{v}'$ is to i.i.d. Bernoulli?

Best Answer

1) Meta-issue:
I believe the OP should include in the title of the question a signal that something special is going on here: for example, instead of "Are linear combinations of..." the title should read "Are $\mathbb{F}_2$-linear combinations of...", so that the reader gets the message that concepts may have special meanings in this specific question.

2) Bernoulli or not Bernoulli?
Under $\mathbb{F}_2$-arithmetic, the sum of i.i.d. Bernoulli r.v.'s is not binomial, since all probability mass is condensed on the values $0$ and $1$. We first derive the binomial distribution and then add the probabilities of all even values of the support to the probability of $0$, and the probabilities of all odd values of the support to the probability of $1$, and thus we obtain again a Bernoulli r.v. This validates the probability mass function included in the OP's question: $v'_i$ is by construction a Bernoulli random variable, irrespective of how $P(v'_i = 1)$ is derived. Moreover, if the number of $1$'s is the same in each row of the matrix $A$, then each $v'_i$ has an identical Bernoulli marginal distribution.
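This folding construction can be carried out explicitly. A minimal sketch, again with made-up $p$ and $w$, that condenses the binomial probabilities onto $\{0,1\}$:

```python
from math import comb

p, w = 0.2, 5  # made-up example values

# Binomial pmf of the integer (not mod-2) sum of w i.i.d. Bernoulli(p) bits.
pmf = [comb(w, x) * p**x * (1 - p)**(w - x) for x in range(w + 1)]

# Fold: probabilities of even support values go to 0, odd ones to 1.
P0 = sum(pmf[x] for x in range(0, w + 1, 2))
P1 = sum(pmf[x] for x in range(1, w + 1, 2))

print(P0, P1, P0 + P1)  # P1 matches the pmf in the question; P0 + P1 == 1
```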

3) $\mathbb{F}_2-$ linear independence.
$\mathbb{F}_2$-linear independence does not look much like the "usual" concept of linear independence. To be able to obtain a square matrix of full row/column rank under $\mathbb{F}_2$-arithmetic, I conjecture that $w$, the number of $1$'s in each row, should be an odd number. Consider a square $n\times n$ matrix $A$, and the first element of one of its rows, say $a_{i1}$. I reason as follows:
a) Assume $w$ is an even number.
a1) Assume $a_{i1} = 0$. Then among the remaining $n-1$ elements of the row, there is an even number of $1$'s, which by $\mathbb{F}_2$-arithmetic will give us $0$. The rest of the elements of the row are all zero, so overall the sum of the remaining $n-1$ elements will give $0$, i.e., equal to the value of the first element.
a2) Assume $a_{i1} = 1$. Then among the remaining $n-1$ elements of the row, there is an odd number of $1$'s, which by $\mathbb{F}_2$-arithmetic will give us $1$. The rest of the elements of the row are all zero, so overall the sum of the remaining $n-1$ elements will give $1$, i.e., again equal to the value of the first element.

So if $w$ is even, the $\mathbb{F}_2$-sum of any $n-1$ columns always equals the remaining column, depriving us of full rank.

b) Assume $w$ is an odd number.
b1) Assume $a_{i1} = 0$. Then among the remaining $n-1$ elements of the row, there is an odd number of $1$'s, which by $\mathbb{F}_2$-arithmetic will give us $1$. The rest of the elements of the row are all zero, so overall the sum of the remaining $n-1$ elements will give $1$, i.e., different from the value of the first element.
b2) Assume $a_{i1} = 1$. Then among the remaining $n-1$ elements of the row, there is an even number of $1$'s, which by $\mathbb{F}_2$-arithmetic will give us $0$. The rest of the elements of the row are all zero, so overall the sum of the remaining $n-1$ elements will give $0$, i.e., again different from the value of the first element.

So it appears that odd $w$ is at least a necessary condition for a square matrix $A$ to have full column/row rank.
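This necessity can be checked computationally. Below is a minimal sketch (the helper names `f2_rank` and `random_constant_weight` are my own, not from any library) that computes ranks over $\mathbb{F}_2$ by Gaussian elimination and compares even against odd row weights:

```python
import numpy as np

def f2_rank(M):
    """Rank of a binary matrix over F_2, by Gaussian elimination mod 2."""
    M = M.copy() % 2
    n_rows, n_cols = M.shape
    rank = 0
    for col in range(n_cols):
        # Find a pivot row with a 1 in this column.
        pivot = next((r for r in range(rank, n_rows) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        # Eliminate this column from all other rows (row XOR = F_2 addition).
        for r in range(n_rows):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

rng = np.random.default_rng(0)
n, w_even, w_odd = 8, 4, 3  # made-up example sizes

def random_constant_weight(n, w):
    """Random square binary matrix with exactly w ones per row."""
    M = np.zeros((n, n), dtype=int)
    for i in range(n):
        M[i, rng.choice(n, size=w, replace=False)] = 1
    return M

# Even row weight: every row sums to 0 mod 2, so the columns sum to the
# zero vector and are dependent; the rank should never reach n.
print(max(f2_rank(random_constant_weight(n, w_even)) for _ in range(200)))

# Odd row weight: full rank n is attainable (and typically hit quickly).
print(max(f2_rank(random_constant_weight(n, w_odd)) for _ in range(200)))
```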

4) Stochastic (in)dependence in the $\mathbb{F}_2-$world.
Do the characteristics of the $\mathbb{F}_2$-field affect the concept of stochastic (in)dependence? No. Two r.v.'s are independent if and only if their joint distribution is the product of their marginals, meaning the conditional distributions must equal the unconditional ones. Maybe the way operations work in the $\mathbb{F}_2$-field produces some unexpected results? Let's see: assume that we have a square matrix $A$ that has $\mathbb{F}_2$-linearly independent rows. The column vector process $\boldsymbol{v}'$, say of dimension $5\times 1$, is written

$$\boldsymbol{v}' = A\boldsymbol{v} = \left[\begin{matrix} v_1'(v_0,...)\\ v_2'(v_0,...)\\ v_3'(v_0,...)\\ v_4'(v_0,...)\\ v_5'(v_0,...)\\ \end{matrix}\right]$$

Now assume that $w=3$ and, say, that the $1$'s in $A$ are dispersed such that we have

$$ v_2' = v_0+v_1+v_5,\qquad v_4' = v_2+v_3+v_5$$

Consider the conditional probability (under $\mathbb{F}_2-$ arithmetic)

$$P_{\mathbb{F}_2}(v_2' =1\mid v_4'=0) = P_{\mathbb{F}_2}(v_0+v_1+v_5 =1\mid v_2+v_3+v_5=0)$$

If the conditioning statement is to affect the probabilities of $v_2'$, it will do so through $v_5$: the fact that $v_2+v_3+v_5=0$ must affect the probabilities related to $v_5$. Under the "usual" arithmetic this would be obvious: it would mean that $v_5$ should equal zero. What happens under $\mathbb{F}_2-$ arithmetic? We can examine $$P_{\mathbb{F}_2}(v_5 =1\mid v_2+v_3+v_5=0) = \frac{P_{\mathbb{F}_2}(\{v_5 =1\}\land \{v_2+v_3+v_5=0\})}{P_{\mathbb{F}_2}(v_2+v_3+v_5=0)}$$

The possible values of $v_2+v_3+v_5$ under $\mathbb{F}_2$-arithmetic are

$$\begin{array}{ccc|c|c}
v_2 & v_3 & v_5 & v_2+v_3+v_5 & \text{probability} \\ \hline
0 & 0 & 0 & 0 & (1-p)^3 \\
0 & 0 & 1 & 1 & p(1-p)^2 \\
0 & 1 & 0 & 1 & p(1-p)^2 \\
1 & 0 & 0 & 1 & p(1-p)^2 \\
0 & 1 & 1 & 0 & p^2(1-p) \\
1 & 0 & 1 & 0 & p^2(1-p) \\
1 & 1 & 0 & 0 & p^2(1-p) \\
1 & 1 & 1 & 1 & p^3
\end{array}$$

from which we get the following contingency table

$$\begin{array}{c|cc}
 & v_2+v_3+v_5=0 & v_2+v_3+v_5=1 \\ \hline
v_5=0 & (1-p)^3+p^2(1-p) & 2p(1-p)^2 \\
v_5=1 & 2p^2(1-p) & p(1-p)^2+p^3
\end{array}$$

Therefore $$P_{\mathbb{F}_2}(v_5 =1\mid v_2+v_3+v_5=0) = \frac{2p^2(1-p)}{(1-p)^3+3p^2(1-p)}=\frac{2p^2}{1-2p+4p^2} \neq p$$ except when $p=1/2$, but the setup explicitly specifies that $p<1/2$. So we conclude that $$P_{\mathbb{F}_2}(v_5 =1\mid v_2+v_3+v_5=0) \neq P_{\mathbb{F}_2}(v_5 =1)$$

and so $$P_{\mathbb{F}_2}(v_2' =1\mid v_4'=0) \neq P_{\mathbb{F}_2}(v_2' =1)$$
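This conclusion can be verified by brute-force enumeration over all $2^6$ assignments of $(v_0,\dots,v_5)$; a minimal sketch, with $p$ a made-up example value:

```python
from itertools import product

p = 0.2  # any fixed p != 1/2; made-up example value

num = den = marg = 0.0
for bits in product((0, 1), repeat=6):  # all assignments of (v_0, ..., v_5)
    v0, v1, v2, v3, v4, v5 = bits       # v_4 plays no role in either sum
    prob = 1.0
    for b in bits:
        prob *= p if b else (1 - p)
    v2p = (v0 + v1 + v5) % 2            # v_2'
    v4p = (v2 + v3 + v5) % 2            # v_4'
    marg += prob * v2p                  # accumulates P(v_2' = 1)
    den += prob * (1 - v4p)             # accumulates P(v_4' = 0)
    num += prob * v2p * (1 - v4p)       # accumulates P(v_2' = 1, v_4' = 0)

print(num / den, marg)  # ~0.3579 vs ~0.3920: conditional != marginal
```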

I have shown that, in general, the elements of the vector process $\boldsymbol{v}'$ are not independent (although they have identical marginals), even under $\mathbb{F}_2$-arithmetic. I have found and read the lemma in the paper the OP mentions. There is no actual proof, just a verbal assertion that they are independent, due to the $\mathbb{F}_2$-linear independence of the rows of $A$. Hopefully, I am mistaken, but for the moment it appears the assertion is not valid.

5) Now what?
Not much. The joint distribution of the random vector will depend on how the $1$'s are allocated in the matrix $A$. Without a specific form for $A$, the various measures of distance between distributions become vacuous. Intuitively, if $w$ is small relative to the length of the rows of $A$, one can expect (or hope) that the dependence will be relatively weak, so that pretending the vector is i.i.d. won't hurt much; but without knowing the joint distribution, one cannot really tell. Copulas for discrete r.v.'s suffer from identification issues. I am still thinking about it, but I am not optimistic.
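That said, for a small enough $A$ one can quantify the discrepancy exactly: enumerate the joint distribution of $\boldsymbol{v}'$ and compute its total variation distance from the i.i.d. product of its Bernoulli marginals. A minimal sketch, reusing the two overlapping weight-3 rows from the example above (the matrix and $p$ are made-up):

```python
from itertools import product

p = 0.2  # made-up example value

# A made-up 2 x 6 matrix with row weight w = 3; the rows overlap in column 5,
# matching v_2' = v_0 + v_1 + v_5 and v_4' = v_2 + v_3 + v_5 from above.
A = [[1, 1, 0, 0, 0, 1],
     [0, 0, 1, 1, 0, 1]]

# Exact joint distribution of the two hashed bits.
joint = {}
for bits in product((0, 1), repeat=6):
    prob = 1.0
    for b in bits:
        prob *= p if b else (1 - p)
    key = tuple(sum(a * b for a, b in zip(row, bits)) % 2 for row in A)
    joint[key] = joint.get(key, 0.0) + prob

# Common Bernoulli marginal of each hashed bit.
q = sum(pr for key, pr in joint.items() if key[0] == 1)

# Total variation distance between the joint law and the i.i.d. product law.
tv = 0.5 * sum(
    abs(joint.get((a, b), 0.0) - (q if a else 1 - q) * (q if b else 1 - q))
    for a in (0, 1) for b in (0, 1)
)
print(tv)  # strictly positive: v' is close to, but not exactly, i.i.d.
```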
