Uniform Distribution – Understanding Sufficient Statistics for Uniform $(-\theta,\theta)$

Tags: inference, self-study, sufficient-statistics, uniform-distribution

So, I know that $\max(-X_{(1)},X_{(n)})$ is a sufficient statistic for the parameter $\theta$. But can I also say that $(X_{(1)},X_{(n)})$ are jointly sufficient for the parameter $\theta$?

In other words, can a single parameter have jointly sufficient statistics?

Best Answer

Suppose we have a random sample $(X_1,X_2,\cdots,X_n)$ drawn from the $\mathcal U(-\theta,\theta)$ distribution.

The pdf of $X\sim\mathcal U(-\theta,\theta)$ is $$f(x;\theta)=\frac{1}{2\theta}\mathbf1_{-\theta<x<\theta},\quad\theta>0.$$

The joint density of $(X_1,X_2,\cdots,X_n)$ is

\begin{align}f_{\theta}(x_1,x_2,\cdots,x_n)&=\prod_{i=1}^nf(x_i;\theta) \\&=\frac{1}{(2\theta)^n}\mathbf1_{-\theta<x_1,\cdots,x_n<\theta} \\&=\frac{1}{(2\theta)^n}\mathbf1_{|x_1|,\cdots,|x_n|<\theta} \\&=\frac{1}{(2\theta)^n}\mathbf1_{\max_{1\le i\le n}|x_i|<\theta} \end{align}
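As a quick numerical sanity check (a sketch, not part of the derivation; it assumes NumPy, and the helper names are made up), the product of the individual densities agrees with the closed form $(2\theta)^{-n}\,\mathbf 1_{\max_i|x_i|<\theta}$ on simulated data:

```python
import numpy as np

def joint_density_product(x, theta):
    """Product of the individual U(-theta, theta) densities."""
    inside = np.all((x > -theta) & (x < theta))
    return inside / (2 * theta) ** len(x)

def joint_density_closed_form(x, theta):
    """Closed form (2*theta)^{-n} * 1{max_i |x_i| < theta}."""
    return (np.max(np.abs(x)) < theta) / (2 * theta) ** len(x)

rng = np.random.default_rng(0)
theta = 2.5
x = rng.uniform(-theta, theta, size=10)

# The two expressions agree, both for a theta that covers the sample
# and for a smaller theta where both densities vanish.
assert np.isclose(joint_density_product(x, theta),
                  joint_density_closed_form(x, theta))
assert np.isclose(joint_density_product(x, 0.5),
                  joint_density_closed_form(x, 0.5))
```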

To apply the factorisation theorem cleanly, let us define $\mathbb I(x)=\begin{cases}1, &\text{ if }x>0\\0, &\text{ otherwise}\end{cases}$

Then we have

\begin{align} f_{\theta}(x_1,x_2,\cdots,x_n)&=\frac{1}{(2\theta)^n}\mathbb I\left(\theta-\max_{1\le i\le n}|x_i|\right) \\&=g\left(\theta,\max_{1\le i\le n}|x_i|\right)h(x_1,x_2,\cdots,x_n) \end{align}

where $g\left(\theta,\max_{1\le i\le n}|x_i|\right)=\frac{1}{(2\theta)^n}\mathbb I\left(\theta-\max_{1\le i\le n}|x_i|\right)$ depends on $\theta$ and on $x_1,x_2,\cdots,x_n$ through $\max_{1\le i\le n}|x_i|$, and $h(x_1,x_2,\cdots,x_n)=1$ is independent of $\theta$.

So, by the factorisation theorem, $\max_{1\le i\le n}|X_i|$ is a sufficient statistic for $\theta$.
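One way to see the factorisation at work numerically (a hedged sketch, assuming NumPy; `joint_density` is an illustrative helper, not standard code): because $h(x_1,\cdots,x_n)=1$, two visibly different samples sharing the same value of $\max_i|x_i|$ must have identical likelihoods at every $\theta$:

```python
import numpy as np

def joint_density(x, theta):
    """(2*theta)^{-n} on the event max_i |x_i| < theta, else 0."""
    return (np.max(np.abs(x)) < theta) / (2 * theta) ** len(x)

rng = np.random.default_rng(1)
n = 8
x1 = rng.uniform(-1.0, 1.0, size=n)

# Build a second, different sample sharing the same T = max_i |x_i|.
t = np.max(np.abs(x1))
x2 = rng.uniform(-1.0, 1.0, size=n) * 0.9 * t   # strictly smaller in |.|
x2[0] = t                                        # force the same maximum

# Because h(x) = 1 in the factorisation, the joint density depends on the
# data only through T, so the two samples have identical likelihoods.
for theta in (0.5, 1.0, 2.0, 5.0):
    assert np.isclose(joint_density(x1, theta), joint_density(x2, theta))
```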

In fact, it can be shown to be minimal sufficient for $\theta$.

Since $-\theta<x_i<\theta$ for all $i$ implies $\theta>\max(-x_{(1)},x_{(n)})$, you are correct that $\max(-X_{(1)},X_{(n)})$ is a sufficient statistic for $\theta$ by the same logic. In fact, $$\max(-X_{(1)},X_{(n)})=\max_{1\le i\le n}|X_i|,$$ because $|x|=\max(x,-x)$, so $\max_i|x_i|=\max\big(\max_i x_i,\,\max_i(-x_i)\big)=\max(x_{(n)},-x_{(1)})$.
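A one-line check of this identity on simulated data (a sketch, assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-3.0, 3.0, size=1000)

# max(-X_(1), X_(n)) coincides with max_i |X_i|, because |x| = max(x, -x).
assert np.isclose(max(-x.min(), x.max()), np.abs(x).max())
```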

It is perfectly valid for a single unknown parameter to have jointly sufficient statistics: by definition, a set of statistics $T_1(X_1,\cdots,X_n),\cdots,T_k(X_1,\cdots,X_n)$ is jointly sufficient for $\theta$ (which may be a vector) if and only if the conditional distribution of $X_1,\cdots,X_n$ given $T_1,\cdots,T_k$ does not depend on $\theta$. Nothing in this definition requires the number of statistics $k$ to match the dimension of $\theta$.

For example, there are always two trivial choices of jointly sufficient statistics: the sample $(X_1,X_2,\cdots,X_n)$ itself and the set of order statistics $(X_{(1)},X_{(2)},\cdots,X_{(n)})$, as mentioned by knrumsey in the comments.

Again, $(X_{(1)},X_{(n)})$ is also sufficient for $\theta$, since the minimal sufficient statistic $\max(-X_{(1)},X_{(n)})$ is a function of it; but it is not itself minimal sufficient, as mentioned in the comments.
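To illustrate why the pair cannot be minimal sufficient (again a sketch with hypothetical data, assuming NumPy): two samples can differ in $(x_{(1)},x_{(n)})$ yet have identical likelihoods at every $\theta$ because they share the same $\max_i|x_i|$:

```python
import numpy as np

def joint_density(x, theta):
    """(2*theta)^{-n} on the event max_i |x_i| < theta, else 0."""
    return (np.max(np.abs(x)) < theta) / (2 * theta) ** len(x)

# Two samples with different (x_(1), x_(n)) but the same max_i |x_i| = 0.9:
x1 = np.array([-0.9, -0.2, 0.1, 0.4])   # (x_(1), x_(n)) = (-0.9, 0.4)
x2 = np.array([-0.3, -0.1, 0.2, 0.9])   # (x_(1), x_(n)) = (-0.3, 0.9)

# The likelihoods agree at every theta, so (X_(1), X_(n)) distinguishes
# samples that the likelihood does not -- it cannot be minimal sufficient.
for theta in (0.5, 1.0, 2.0):
    assert np.isclose(joint_density(x1, theta), joint_density(x2, theta))
```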


To address the OP's further queries in the comments:

If a statistic $T$ is sufficient for a one-dimensional parameter $\theta$, then $\mathbf T=(T,T_0)$ is also a sufficient statistic for $\theta$ for any other statistic $T_0$. This is because $T$ is a function of $\mathbf T$, so the factorisation that works for $T$ also serves as a factorisation in terms of $\mathbf T$. In particular, if $T$ is minimal sufficient, then $\mathbf T$ remains sufficient, but the second component $T_0$ contributes nothing to further data condensation: discarding it loses no information about $\theta$. So $T$ is preferred to $(T,T_0)$ as a sufficient statistic.

For example, consider a $\mathcal N(\theta,1)$ population. Here the sample mean $\bar X$ is minimal sufficient for $\theta$. If we take $\mathbf T=(\bar X,S^2)$, where $S^2$ is the sample variance, then $\mathbf T$ remains sufficient but is no longer minimal sufficient.
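A small simulation can make this concrete (a sketch, assuming NumPy): the distribution of $S^2$ does not change with $\theta$, so appending it to $\bar X$ condenses the data no further:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 20, 5000

def sample_variances(theta):
    """Sample variances S^2 from `reps` samples of size n drawn from N(theta, 1)."""
    x = rng.normal(loc=theta, scale=1.0, size=(reps, n))
    return x.var(axis=1, ddof=1)

# S^2 has the same distribution whatever theta is (its mean is about 1 in
# both runs), so tacking it onto X_bar adds no information about theta.
print(sample_variances(theta=0.0).mean(), sample_variances(theta=10.0).mean())
```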

Thanks to @knrumsey for pointing out my error.
