Sufficient order statistics

mathematical-statistics, order-statistics, self-study, sufficient-statistics

I'm having trouble understanding example 6.2.5 of Casella and Berger:

Let $X_1,\ldots, X_n$ be iid from a pdf $f$, where we are unable to specify any more information about the pdf. It then follows that the sample density is given by
$$f(\mathbf{x}) = \prod_{i=1}^n f(x_i) = \prod_{i=1}^n f(x_{(i)}), $$
where $x_{(1)} \leq \ldots\leq x_{(n)}$ are the order statistics. By Theorem 6.2.2, we can show that the order statistics are a sufficient statistic.

So the identity for the density should hold when $X_1, X_2 \sim Unif(0,1)$ are iid, but then $f(\mathbf{x}) = f(x_1)\cdot f(x_2) = 1$
while
$$\prod_{i=1}^n f(x_{(i)}) = f(x_{(1)})\cdot f(x_{(2)}) = 2(1-x_{(1)})\cdot 2x_{(2)} $$
since $X_{(j)} \sim Beta(j, n-j+1)$.

Is there an error in the book or am I misunderstanding something?

Best Answer

Your confusion stems, I presume, from the loose notation: when writing
$$f(\mathbf{x}) = \prod_{i=1}^n f(x_i) = \prod_{i=1}^n f(x_{(i)}),$$
George Casella and Roger Berger use $f$ first for the density of the vector, $f(\mathbf{x})$, then for the density of the components, $f(x_i)$, and then again for the density of the components, now evaluated at the ordered values, $f(x_{(i)})$, rather than for the densities of the order statistics. Distinguishing between all three as in
\begin{align*}\mathbf X &\sim f_{1:n}(\mathbf x) \\ X_i &\sim f_1(x_i)\quad i=1,\ldots,n\\ X_{(i)} &\sim f_{(i)}(x_{(i)}) \end{align*}
the identity looks like
$$\underbrace{f_{1:n}(\mathbf{x})}_{\text{joint}} = \prod_{i=1}^n \underbrace{f_1(x_i)}_{\text{individual}} = \overbrace{\prod_{i=1}^n \underbrace{f_1(x_{(i)})}_{\text{individual}}}^{\substack{\text{reordering}\\ \text{by rank}}} \ne \overbrace{\prod_{i=1}^n\underbrace{f_{(i)}(x_{(i)})}_{\text{ordered}}}^{\substack{\text{incorrect due}\\ \text{to dependence}}}.$$
The book's second product only reorders the factors $f_1(x_i)$ by rank, which leaves the product unchanged. The product you computed uses the marginal densities $f_{(i)}$ of the order statistics instead, and since the order statistics are dependent, that product is not their joint density and does not appear in the identity.
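The distinction can be checked numerically on the question's own case. The following is a minimal sketch, assuming $n=2$ draws from $Unif(0,1)$; the helper names `uniform_pdf`, `beta_pdf` and `compare` are illustrative choices, not anything from Casella and Berger.

```python
import math


def uniform_pdf(x):
    """Density f_1 of a single Unif(0,1) observation."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def beta_pdf(x, a, b):
    """Beta(a, b) density for positive integer a and b (factorial form)."""
    if not 0.0 <= x <= 1.0:
        return 0.0
    const = math.factorial(a + b - 1) / (
        math.factorial(a - 1) * math.factorial(b - 1)
    )
    return const * x ** (a - 1) * (1.0 - x) ** (b - 1)


def compare(sample):
    """Return the three products discussed above for a Unif(0,1) sample."""
    n = len(sample)
    ordered = sorted(sample)
    # Joint density: product of f_1 over the observations as drawn.
    joint = math.prod(uniform_pdf(x) for x in sample)
    # Same factors f_1, merely reordered by rank (the book's second product).
    reordered = math.prod(uniform_pdf(x) for x in ordered)
    # Product of the marginals f_(j) = Beta(j, n-j+1) of the order statistics.
    marginals = math.prod(
        beta_pdf(x, j, n - j + 1) for j, x in enumerate(ordered, start=1)
    )
    return joint, reordered, marginals


joint, reordered, marginals = compare([0.7, 0.2])
print(joint, reordered, marginals)
```

For the sample $(0.7, 0.2)$ the first two values are both 1, while the third equals $2(1-0.2)\cdot 2(0.7) \approx 2.24$, which is the quantity computed in the question.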
