So I read somewhere that if you have $N$ numbers picked independently from a uniform distribution, say on $[0,1]$, the greatest number has an expected value of $\frac{N}{N+1}$. So if you have 2 numbers the greatest has expected value $2/3$ and the smallest has expected value $1/3$. In other words, the expected values are evenly spaced across $[0,1]$. This makes sense, but is there a clear/intuitive proof of this? Thanks 🙂
[Math] The largest of $N$ random numbers over a uniform distribution
probability, statistics
Related Solutions
If $X$ is a (vector) random variable that takes on each of the $2^n$ values $$\{00\cdots 00, \quad 00\cdots 01,\quad 00\cdots 10,\quad \ldots\ , \quad 11\cdots 10,\quad 11\cdots 11\}$$ with equal probability $2^{-n}$, then for any fixed $n$-bit vector $m$, $Z = X\oplus m$ also has the same distribution, since for all choices of $n$-bit vector $a$ we have that $$P\{Z = a\} = P\{m\oplus X = a\} = P\{X = m\oplus a\} = 2^{-n}.\tag{1}$$

But suppose that $Y$ is another random variable taking on the same $2^n$ values with arbitrary distribution (including the possibility that $Y$ takes on some values with probability $0$), and suppose that $Z = X \oplus Y$. Given that the event $\{Y = m\}$ occurs, suppose that the conditional distribution of $X$ conditioned on $\{Y = m\}$ is uniform on the $2^n$ values of $X$. Then we have that $$P\{Z = a\mid Y = m\} = P\{X\oplus Y = a\mid Y = m\} = P\{X = m\oplus a \mid Y = m\} = 2^{-n}. \tag{2}$$

Equation $(2)$ shows that $P\{Z = a \mid Y = m\}$, the conditional probability that $Z = X\oplus Y$ equals $a$ given that $Y = m$, is $2^{-n}$. If $(2)$ holds for all $m$, then since the unconditional probability that $Z = a$ is, by the law of total probability, a weighted sum of the conditional probabilities, we also have that $$P\{Z = a\} = \sum_m P\{Z=a\mid Y = m\}P\{Y = m\} = 2^{-n}\sum_m P\{Y = m\} = 2^{-n}.$$

Note that we have implicitly assumed that $X$ and $Y$ are independent, because we have assumed that for all choices of $m$, the conditional distribution of $X$ given $\{Y = m\}$ is a uniform distribution on the $2^n$ values of $X$; that is, the conditional distribution of $X$ is the same regardless of the value of $Y$.
Two points are worth noting here.
It is not necessary that $Y$ also be uniformly distributed on the $2^n$ values as long as $X$ and $Y$ are assumed to be independent. In fact, as $(1)$ indicates, $Z = X\oplus Y$ is uniformly distributed on the $2^n$ values even if $Y$ is a degenerate random variable taking on the value $m$ with probability $1$. (Note that even in this case, $X$ and $Y$ are nonetheless independent random variables.)
While $X$ and $Y$ being independent is sufficient for $Z$ to inherit the uniform distribution on the $2^n$ values that $X$ enjoys, independence of $X$ and $Y$ is not necessary either. For example, with $n=2$ and $X$ uniformly distributed on $\{00, 01, 10, 11\}$, consider the joint distribution given below: $$P\{X = 00, Y = 00\} = \frac{1}{4}, \quad P\{X = 11, Y = 00\}= \frac{1}{4},\\ P\{X = 01, Y = 11\} = \frac{1}{4}, \quad P\{X = 10, Y = 11\}= \frac{1}{4},$$ where $X$ and $Y$ are clearly not independent, $Y$ is not uniformly distributed on $\{00, 01, 10, 11\}$ (though it is uniformly distributed on $\{00, 11\}$), and yet $Z$ is uniformly distributed on $\{00, 01, 10, 11\}$ just as $X$ is.
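The dependent example above is small enough to check by direct enumeration. A minimal Python sketch (the variable names are mine, just for illustration) tabulates $Z = X \oplus Y$ over the four equally likely $(X, Y)$ pairs:

```python
from collections import Counter

# Joint distribution from the example: four equally likely (X, Y) pairs.
# X is uniform on {00, 01, 10, 11}; Y takes only the values 00 and 11.
joint = [(0b00, 0b00), (0b11, 0b00), (0b01, 0b11), (0b10, 0b11)]

# Tabulate Z = X xor Y over the joint distribution.
z_counts = Counter(x ^ y for x, y in joint)
z_dist = {z: count / len(joint) for z, count in z_counts.items()}
print(z_dist)  # each of 00, 01, 10, 11 appears with probability 1/4
```

Each of the four pairs maps to a distinct value of $Z$, so $Z$ is uniform on $\{00, 01, 10, 11\}$ even though $X$ and $Y$ are dependent.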
You have to generate four 32-bit random numbers and then merge them together. Adding them is definitely a bad idea: you won't get a uniform distribution. But merging in the sense of "bits 1 to 32 from the 1st generated number, bits 33 to 64 from the 2nd, and so on" gives you a uniformly distributed binary number of length 128, since the probability that any given bit equals zero $=$ the probability that it equals one $= 1/2$. Numerically, you multiply the first number by $2^{96}$, the second by $2^{64}$, the third by $2^{32}$, take the fourth as is, and then sum them up.
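The bit-merging recipe can be sketched in Python, whose integers are arbitrary-precision; here `getrandbits(32)` stands in for whatever uniform 32-bit generator you actually have:

```python
import random

def uniform_128_bit(rng=random):
    """Merge four uniform 32-bit draws into one uniform 128-bit integer."""
    a, b, c, d = (rng.getrandbits(32) for _ in range(4))
    # Bit-shifting into disjoint 32-bit fields is exactly the same as
    # a * 2**96 + b * 2**64 + c * 2**32 + d.
    return (a << 96) | (b << 64) | (c << 32) | d

x = uniform_128_bit()
assert 0 <= x < 2**128
```

Because the four fields are disjoint, every bit of the result is an independent fair coin flip, which is what makes the result uniform on $[0, 2^{128})$.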
Best Answer
Let $U_1 , ... , U_N$ be the $N$ random variables from the uniform distribution. Let $X$ be their maximum. We will in fact compute the distribution of $X$, that is
$$F(t):= \mathbb P(X < t) .$$
The maximum of $N$ numbers is less than $t$ if and only if all $N$ of them are less than $t$. Therefore
$$\mathbb P(X < t) = \mathbb P(U_1 < t , ... , U_N < t).$$
Since the $U_j$'s are independent, this is the same as
$$\mathbb P(U_1 < t) \cdots \mathbb P(U_N < t).$$
If $t \in [0,1]$, then since the $U_j$'s are uniformly distributed we get
$$F(t) = t \cdots t = t^N.$$
Hence the density of $X$ is $F'(t) = N t^{N-1}$ on $[0,1]$. So, the mean of $X$ is
$$\int_0^1 t\cdot Nt^{N-1}\, dt = \frac{N}{N+1}.$$
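The value $\frac{N}{N+1}$ is easy to confirm numerically. A quick Monte Carlo sketch in Python (the trial count and seed are arbitrary choices of mine):

```python
import random

def mean_of_max(n, trials=200_000, seed=0):
    """Monte Carlo estimate of E[max of n iid Uniform(0,1) draws]."""
    rng = random.Random(seed)
    total = sum(max(rng.random() for _ in range(n)) for _ in range(trials))
    return total / trials

for n in (1, 2, 5):
    # The estimate should be close to n / (n + 1).
    print(n, mean_of_max(n), n / (n + 1))
```

With $2 \times 10^5$ trials the estimates typically agree with $\frac{N}{N+1}$ to two or three decimal places.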
EDIT: The general case of the expectation of the $k$th largest of $N$ such random variables follows from a similar argument. Here one uses that the $k$th largest is less than $t$ if and only if at least $N-k+1$ of the random variables are less than $t$ (for each possible count, there are a binomial number of ways for this to happen). Summing those binomial terms gives the distribution function, and integrating as above yields the expectation.
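For $N$ uniforms on $[0,1]$, the expectation of the $k$th largest works out to $\frac{N-k+1}{N+1}$, so the expected values split $[0,1]$ into $N+1$ equal pieces, matching the observation in the question. A simulation sketch, again with arbitrary trial counts:

```python
import random

def mean_kth_largest(n, k, trials=200_000, seed=1):
    """Monte Carlo estimate of E[k-th largest of n iid Uniform(0,1) draws]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        draws = sorted(rng.random() for _ in range(n))
        total += draws[n - k]  # k-th largest = (n-k+1)-th smallest
    return total / trials

# e.g. the median of 3 uniforms: expect (3 - 2 + 1) / (3 + 1) = 1/2
print(mean_kth_largest(3, 2))
```

Setting $k = 1$ recovers the $\frac{N}{N+1}$ result for the maximum, and $k = N$ gives $\frac{1}{N+1}$ for the minimum.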