[Math] Joint and Marginal distributions of a random sample

probability distributions

Let $X_{1},X_{2},\ldots ,X_{n}$ be a random sample of size $n$ from a population distribution $F$. I want to find the following:
1. the joint pdf of $X_{1},X_{2},\ldots ,X_{n}$.
2. the marginal probability distribution of $X_{j}$ for any $j$ with $1\leq j \leq n$.


This is my attempt for 1.
Given $F$, the joint pdf of the random sample is
$$\begin{align*}
F\left(X_{1},X_{2},\ldots ,X_{n}\right) & =P \left(X_{1}\leq x_1, X_{2}\leq x_2, \ldots, X_n\leq x_n \right)\\
&=P(X_1 \leq x_1)P(X_2\leq x_2)\cdots P(X_n\leq x_n) \\
&=\left[P(X_1\leq x_1)\right ]^n \qquad \because X_j\text{'s are identical}
\end{align*} $$


Here are my questions. First, is my attempt for 1. right? Is there a better way of doing it? Second, I would like some help with 2. I know the $X_j$'s will all have the same marginal distribution, but I don't know how to justify it.

Thanks.

Best Answer

We will assume that the $X_j$ are independent. This assumption is not automatically built into the definition of random sampling, but it is necessary if we are to give a complete answer.

If $F(x)$ is the cumulative distribution function for the population, and $F_n(x_1, x_2,\dots, x_n)$ is the joint sample (cumulative) distribution function, then, more or less as you wrote, we have $$F_n(x_1,x_2,\dots,x_n)=P(X_1 \le x_1)P(X_2\le x_2)\cdots P(X_n \le x_n).$$ Please note that $F_n$ is a function of the $n$ real variables $x_1, x_2, \dots,x_n$. (No caps!) The reasoning that you used to get to this stage was correct.

We can therefore write $$F_n(x_1,x_2,\dots,x_n)=F(x_1)F(x_2)\cdots F(x_n).$$
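
As a rough numerical sanity check of this factorization, the sketch below simulates independent draws and compares the empirical joint cdf with the product $F(x_1)F(x_2)F(x_3)$. The Exponential(1) population, the evaluation points, and the function names are assumptions chosen only for illustration; nothing in the question fixes $F$.

```python
import math
import random


def population_cdf(x):
    """CDF of Exponential(1), an assumed stand-in for the population F."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def empirical_joint_cdf(points, n_trials=100_000, seed=0):
    """Estimate P(X_1 <= x_1, ..., X_n <= x_n) from independent simulated samples."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # One random sample of size n = len(points), drawn independently.
        sample = [rng.expovariate(1.0) for _ in points]
        if all(x_j <= t_j for x_j, t_j in zip(sample, points)):
            hits += 1
    return hits / n_trials


points = (0.5, 1.0, 2.0)  # hypothetical values of x_1, x_2, x_3
product_of_marginals = math.prod(population_cdf(t) for t in points)
estimate = empirical_joint_cdf(points)

print(f"F(x1) F(x2) F(x3)   = {product_of_marginals:.4f}")
print(f"simulated joint cdf = {estimate:.4f}")
```

The two printed numbers should agree up to simulation noise, which is what independence predicts.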

The final displayed expression in the post, namely $[P(X_1 \le x_1)]^n$, is not correct, and cannot be correct, for it does not mention the variables $x_2$ to $x_n$. In the form $[F(x)]^n$, it does occur in the calculation of the distribution of the largest sample value, but that is not the problem you were asked to look at.
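
To illustrate where $[F(x)]^n$ does show up, here is a small simulation sketch comparing it with the empirical cdf of the largest sample value. Again the Exponential(1) population, the values of $x$ and $n$, and the helper names are illustrative assumptions.

```python
import math
import random


def cdf_of_maximum(x, n):
    """[F(x)]^n with F the Exponential(1) cdf (an assumed population)."""
    single = 1.0 - math.exp(-x) if x > 0 else 0.0
    return single ** n


def simulate_max_cdf(x, n, n_trials=100_000, seed=1):
    """Estimate P(max(X_1, ..., X_n) <= x) from independent simulated samples."""
    rng = random.Random(seed)
    hits = sum(
        max(rng.expovariate(1.0) for _ in range(n)) <= x
        for _ in range(n_trials)
    )
    return hits / n_trials


x, n = 1.5, 4  # hypothetical threshold and sample size
theory = cdf_of_maximum(x, n)
simulated = simulate_max_cdf(x, n)

print(f"[F(x)]^n          = {theory:.4f}")
print(f"simulated P(max)  = {simulated:.4f}")
```

The event $\max_j X_j \le x$ is the same as $X_1 \le x, \dots, X_n \le x$, so this is the joint cdf evaluated at the single point $x_1=\cdots=x_n=x$, not the joint cdf itself.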

If we want the joint density function $f(x_1,x_2,\dots,x_n)$, we just multiply the individual density functions $f(x_j)$.
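
Written out, and assuming the population has a density $f=F'$, this reads $$f(x_1,x_2,\dots,x_n)=\frac{\partial^n}{\partial x_1\,\partial x_2\cdots\partial x_n}\,F(x_1)F(x_2)\cdots F(x_n)=\prod_{j=1}^n f(x_j).$$ As an illustration only (the exponential population is an assumption, not part of the question), if $f(x)=\lambda e^{-\lambda x}$ for $x>0$, then $$f(x_1,x_2,\dots,x_n)=\lambda^n \exp\Bigl(-\lambda\sum_{j=1}^n x_j\Bigr)\qquad\text{for all } x_j>0,$$ and the joint density is $0$ otherwise.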

The (marginal) distribution of any $X_j$ is, by independence, the same as the population distribution. So if you want to specify the distribution by using a cdf, the answer would be simply $F(x_j)$. If we are in a continuous situation, and $F'(x)=f(x)$, then the (marginal) density function of $X_j$ is $f(x_j)$.
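
One way to justify this from the joint cdf is to let every variable other than $x_j$ tend to infinity. Since $F(x)\to 1$ as $x\to\infty$, the product form of $F_n$ gives $$P(X_j\le x_j)=\lim_{\substack{x_i\to\infty\\ i\ne j}}F_n(x_1,x_2,\dots,x_n)=F(x_j)\prod_{i\ne j}\,\lim_{x_i\to\infty}F(x_i)=F(x_j).$$ The right-hand side does not depend on which index $j$ was chosen, which is why all the $X_j$ have the same marginal distribution.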
