[Math] Suppose you have $n$ independent, identically distributed random variables. What is the pdf of the max and min of those variables?

probability distributions

Let $X_1, X_2, …, X_n$ denote independent and identically distributed (iid) random variables (r.v.s), each with pdf $f_x(x)$. The (homework) problem asks me to consider the function:

$Y = \min\{X_1, X_2, …, X_n\}$

And the goal is to find the PDF of Y.

Since the min{} function returns one of the random variables, isn't the PDF just $f_x(x)$? (My instructor dropped the hint that we should find the CDF of Y and then take the derivative, which is rather involved.) So I don't think I understand how min{} affects the PDF. Can someone tell me more about how min{} affects the r.v.s?

Best Answer

Suppose that the random variables are iid uniform over the interval $[0,1]$. The pdf is a flat line because each $X_{i}$ is equally likely to be anywhere in $[0,1]$. However, if you collect $n$ of them and look at the minimum, it is more likely to fall in the lower half of the interval than in the upper half, so its pdf should have more area towards $0$ and less up by $1$. That is, it won't be a flat line anymore!
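To see this skew numerically, here is a minimal simulation sketch in Python for the uniform $[0,1]$ case. The function names `simulate_minima` and `fraction_below`, the trial count, and the seed are illustrative choices, not part of the problem. It estimates how often the minimum of $n$ draws lands in the lower half of the interval.

```python
import random


def simulate_minima(n, trials, seed=0):
    """Return `trials` samples of min(X_1, ..., X_n) with X_i ~ Uniform[0, 1]."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    return [min(rng.random() for _ in range(n)) for _ in range(trials)]


def fraction_below(samples, threshold):
    """Fraction of samples that are less than or equal to `threshold`."""
    return sum(s <= threshold for s in samples) / len(samples)


# For n = 1 the minimum is just a single uniform draw; larger n pushes it toward 0.
for n in (1, 2, 5, 10):
    minima = simulate_minima(n, trials=20_000)
    print(f"n = {n:2d}: fraction of minima in [0, 0.5] = {fraction_below(minima, 0.5):.3f}")
```

For $n = 1$ the printed fraction should sit near one half, and it should climb towards $1$ as $n$ grows, which is the extra area near $0$ described above.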

We can represent this problem using the following experiment:

Draw a number line and mark a point $y$ on it. In principle $y$ can be any value, but given the range of the function $\min(X_{1},...,X_{n})$, it makes sense to take $y$ in the interval $[0,1]$. Now imagine the different ways you can drop $n$ data points around $y$ so that the smallest of them is at or below $y$. There are many ways to do this, but keep in mind that each point must land in $[0,1]$, since the probability of a point appearing outside that closed interval is 0.

Your instructor is giving you a good hint. It is almost always easiest to use cdfs when trying to find the distribution of mins and maxes. Note that

$$ F_{Y}(y) = P(Y \leq y) = P(\min (X_{1}, X_{2}, \ldots, X_{n}) \leq y). $$

The idea is to relate this to probabilities involving the individual $X_{i}$ since you know things about them.

Now consider writing: $$ P(\min (X_{1}, X_{2}, \ldots, X_{n}) \leq y) = 1 - P(\min (X_{1}, X_{2}, \ldots, X_{n}) > y). $$

The only way to arrange your points so that $\min (X_{1}, X_{2}, \ldots, X_{n}) > y$ is to put them all above $y$.

So, $$ \begin{array}{lcl} F_{Y}(y) &=& P(Y \leq y) = P(\min (X_{1}, X_{2}, \ldots, X_{n}) \leq y)\\ &=& 1-P(\min (X_{1}, X_{2}, \ldots, X_{n}) > y)\\ &=& 1-P(X_{1}>y, X_{2}>y, \ldots, X_{n}>y) \end{array} $$

Now use the fact that the $X_{i}$ are independent to break that probability up into a product. Use the fact that they are identically distributed in order to write it as one probability to the $n$th power.
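As a numerical sanity check on that product step, here is a minimal Python sketch, assuming the uniform $[0,1]$ example used earlier in this answer. The function names, the values `y = 0.3` and `n = 4`, the trial count, and the seed are all illustrative assumptions. It compares a simulated estimate of $P(\min(X_{1}, \ldots, X_{n}) > y)$ with a single-variable probability raised to the $n$th power.

```python
import random


def prob_min_exceeds(y, n, trials=50_000, seed=1):
    """Estimate P(min(X_1, ..., X_n) > y) for X_i ~ Uniform[0, 1] by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The minimum exceeds y exactly when every one of the n draws exceeds y.
        if all(rng.random() > y for _ in range(n)):
            hits += 1
    return hits / trials


def prob_single_exceeds(y):
    """P(X > y) for a single X ~ Uniform[0, 1]; valid for 0 <= y <= 1."""
    return 1.0 - y


y, n = 0.3, 4
estimate = prob_min_exceeds(y, n)
product = prob_single_exceeds(y) ** n  # independence + identical distribution
print(f"simulated: {estimate:.4f}   product rule: {product:.4f}")
```

The two printed numbers should agree up to simulation noise. Swapping in a different distribution only requires changing how the draws are generated and how the single-variable tail probability is computed.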

You are almost there since you can relate that one probability to the cdf of your original distribution!
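For reference, here is a sketch of where those last steps lead. Write $F_{X}$ and $f_{X}$ for the common cdf and pdf of the $X_{i}$ (the question's $f_x$; the symbol $F_{X}$ is introduced here), and assume the distribution is continuous so that the derivative exists:

$$ \begin{array}{lcl} F_{Y}(y) &=& 1-P(X_{1}>y, X_{2}>y, \ldots, X_{n}>y)\\ &=& 1-P(X_{1}>y)\,P(X_{2}>y) \cdots P(X_{n}>y)\\ &=& 1-\left[P(X_{1}>y)\right]^{n}\\ &=& 1-\left[1-F_{X}(y)\right]^{n} \end{array} $$

Differentiating with respect to $y$ gives

$$ f_{Y}(y) = \frac{d}{dy} F_{Y}(y) = n\left[1-F_{X}(y)\right]^{n-1} f_{X}(y). $$

In the uniform $[0,1]$ example, $F_{X}(y) = y$ and $f_{X}(y) = 1$ on $[0,1]$, so $f_{Y}(y) = n(1-y)^{n-1}$ there, which is largest near $0$ and is flat only when $n = 1$. The same argument applied to the maximum, using $P(\max (X_{1}, \ldots, X_{n}) \leq y) = P(X_{1} \leq y, \ldots, X_{n} \leq y)$, gives the cdf $\left[F_{X}(y)\right]^{n}$ and the pdf $n\left[F_{X}(y)\right]^{n-1} f_{X}(y)$.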
