Given a randomly generated binary matrix with fixed row and column weights, what is the probability that columns have ones at different rows

combinatoricsprobability

Setup: Given $a,b\in\mathbb{N}$, and $b\geq a$ such that $b/a\in\mathbb{N}$, I generate (i.e., uniformly sample amongst all possible matrices) a random matrix $\mathbf{A}\in\{0,1\}^{a,b}$, where $a$ is the number of rows and $b$ is the number of columns, such that each column of $\mathbf{A}$ contains exactly one entry 1 (i.e., weight of one), and each row of $\mathbf{A}$ contains exactly $b/a$ entries of 1 (i.e., weight of $b/a$). This implies that any individual column is uniformly distributed among all length-$a$ columns of weight one (in total there are only $a$ such columns), and any individual row is uniformly distributed among all length-$b$ columns of weight $b/a$.

Question: Assuming $x<a$ and $x<b$, looking at just $x$ specific columns (e.g., the first $x$ columns), what is the probability that all $x$ columns have a one in a different row?

Attempt: I think of counting the number of such matrices. More specifically I count the number of ways to choose the rows of the matrix sequentially. This gives
\begin{align}
\underbrace{{b-x\choose b/a-1}\dots{b-x-(x-1)(b/a-1)\choose b/a-1}}
_{\text{#ways to choose rows with entry 1 in the $x$ columns}}
\underbrace{{b-x(b/a)\choose b/a}\dots{b-(a-1)b/a\choose b/a}}
_{\text{#ways to choose remaining rows}}
\end{align}

and the number of ways to choose the matrix without the imposed restrictions is
\begin{align}
{b\choose b/a}\dots{b-(a-1)b/a\choose b/a}.
\end{align}

Dividing the final terms of the two equations give us our probability. Did I under/over-count the number of possibilities? Do I also need to include the number of rows each of the $x$ columns can have their one entry in? This would result in a further multiplication of $a(a-1)\dots(a-x+1)$.

Best Answer

Out of the $b$ columns, you have $a$ sets of columns with each set having $b/a$ elements each with $1$ in the same place. Goal is to choose $x$ columns, each from a different set.

For $x > a$, the answer is $0$, and henceforth, we assume $x \leq a$. If $x=1$, the answer is $1$. If $x=2$, we just need to ensure that the second picked column is not from the set of the first, that is, it is from the $b - b/a$ columns. The probability of the latter is $(b - b/a)/(b - 1)$

Extending this argument to $x$, we have the probability $p(x)$

$$p(x) = \frac{(b - 0b/a)}{b}\frac{(b - b/a)}{b - 1}\frac{(b - 2b/a)}{b - 2}...\frac{(b - (x - 1)b/a)}{b - x + 1}$$ $$p(x) = \frac{(b - x)!\prod\limits_{k = 0}^{x - 1}\left(b - \frac{kb}{a}\right)}{b!}$$

After some rearrangement

$$p(x) = \frac{(b - x)!(b/a)^{x}a!}{(a - x)!b!}$$

Alternatively, the last equation can be reached at by considering the total number of column permutations that we can select to be $\frac{b!}{(b - x)!}$, and to count the specific permutations we want, we first count the number of ways to permute $x$ items from $a$ different sets $\left(\frac{a!}{(a-x)!}\right)$. In each of these permutations, for each of the elements of $x$, we have $b/a$ choices since there are $b/a$ columns in every such set. The latter gives us the factor of $(b/a)^x$.

Related Question