[Math] Degree of the minimal sufficient statistic for $\theta$ in $U(\theta-1,\theta+1)$ distribution

order-statistics, statistics, sufficient-statistics, uniform-distribution

Suppose $X_1,X_2,\ldots,X_n$ is a random sample from the Uniform distribution over the interval $(\theta-1,\theta+1)$. By the factorization theorem, it is clear that the order statistics $Y_1=X_{(1)}$ and $Y_n=X_{(n)}$ are joint sufficient statistics for $\theta$. Now, because $$\theta-1 < Y_1 < Y_n < \theta+1 $$ implies that $Y_n -1 < \theta < Y_1 + 1$, choosing $\hat\theta \in (Y_n-1,Y_1+1)$ will force the likelihood function to achieve its maximum; that is, $$L(\hat\theta)=\left(\frac{1}{2}\right)^n, \quad \forall\, \hat\theta\in(Y_n-1,Y_1+1)$$
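
To see this flat region concretely, here is a quick numerical sketch (purely illustrative; the choice $\theta = 3$, the sample size, the seed, and the grid are all arbitrary): it simulates a sample, evaluates the likelihood on a grid, and confirms it equals $(1/2)^n$ exactly on $(Y_n-1,\,Y_1+1)$ and $0$ elsewhere.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 3.0              # hypothetical value, just for illustration
n = 10
x = rng.uniform(theta_true - 1, theta_true + 1, size=n)
y1, yn = x.min(), x.max()

def likelihood(theta, x):
    """L(theta) = product of Unif(theta-1, theta+1) densities over the sample."""
    inside = np.all((x > theta - 1) & (x < theta + 1))
    return 0.5 ** len(x) if inside else 0.0

grid = np.linspace(theta_true - 2, theta_true + 2, 2001)
L = np.array([likelihood(t, x) for t in grid])

# The likelihood is flat at (1/2)^n exactly on (yn - 1, y1 + 1).
flat = grid[L == 0.5 ** n]
print(flat.min(), flat.max())   # approximately (yn - 1, y1 + 1)
print(yn - 1, y1 + 1)
```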

Typically, statisticians take $\hat\theta$ to be the midrange of $Y_1$ and $Y_n$: $\hat\theta=\frac{Y_1+Y_n}{2}$.

Now, because such a value of $\hat\theta$ maximizes the likelihood function, it follows that $\hat\theta$ is an MLE of $\theta$ in $\mathrm{Unif}(\theta-1,\theta+1)$. It is noted in the Hogg, McKean, and Craig textbook "Introduction to Mathematical Statistics" ($7^{th}$ edition), however, that even though $\hat\theta$ is an MLE of $\theta$ and a function of the joint sufficient statistics $Y_1,Y_n$ for $\theta$, it is not a minimal sufficient statistic; this is because $\hat\theta$ is itself not a sufficient statistic for $\theta$.

What I'm wondering here is *why* $\hat\theta = \frac{Y_1+Y_n}{2}$ is not a sufficient statistic for $\theta$. By the definition of a sufficient statistic and using the factorization theorem, $Y_1,Y_n$ being joint sufficient statistics for $\theta$ implies that the likelihood function can be written as $$\prod_{i=1}^nf(x_i;\theta) = K_1(Y_1,Y_n;\theta)\cdot K_2(x_1,x_2,\ldots,x_n)$$ where $K_1(Y_1,Y_n;\theta)$ depends on $x_1,\ldots,x_n$ only through $Y_1,Y_n$ and $K_2(x_1,x_2,\ldots,x_n)$ does not depend upon $\theta$; in our case, $$K_1(Y_1,Y_n;\theta)=\left(\frac{1}{2}\right)^n \cdot \mathbf 1_{(\theta-1,\theta+1)}(Y_1) \cdot \mathbf 1_{(\theta-1,\theta+1)}(Y_n)$$

Now, although it is clear that we cannot reduce the product of the two indicator functions any further to any single order statistic $Y_i$ in a way that preserves equality with $K_1$, why is $$\mathbf 1_{(\theta-1,\theta+1)}\left(\frac{Y_1+Y_n}{2}\right) = \mathbf 1_{(\theta-1,\theta+1)}(Y_1) \cdot \mathbf 1_{(\theta-1,\theta+1)}(Y_n)$$ not a valid equality?
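
One way to convince yourself the proposed equality fails is a small numerical sketch (the numbers below are arbitrary illustrative choices): an identity like this would have to hold for every candidate $\theta$, not just the true one. Take data with $Y_1=-0.9$, $Y_n=0.9$ (so the midrange is $0$) and evaluate both sides at the candidate value $\theta=0.5$:

```python
def ind(lo, hi, v):
    """Indicator of the open interval (lo, hi)."""
    return 1.0 if lo < v < hi else 0.0

# Hypothetical data: y1 = -0.9, yn = 0.9, so the midrange is 0.
y1, yn = -0.9, 0.9
mid = (y1 + yn) / 2

# Evaluate both sides at a candidate theta = 0.5, i.e. support (-0.5, 1.5).
theta = 0.5
lhs = ind(theta - 1, theta + 1, mid)                                  # 1.0: midrange is inside
rhs = ind(theta - 1, theta + 1, y1) * ind(theta - 1, theta + 1, yn)   # 0.0: y1 falls outside
print(lhs, rhs)  # 1.0 0.0 -> the two sides disagree, so the equality cannot hold
```

In other words, the midrange alone cannot tell us for which values of $\theta$ the likelihood is zero, while the pair $(Y_1,Y_n)$ can.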

In proceeding further in my reasoning, let's reference this question. For an overview, that question revolves around the sufficient statistics for the uniform distribution $\mathrm{Unif}(-\theta,\theta)$. In this problem, one can reduce the argument from the joint sufficient statistics $X_{(1)},X_{(n)}$ down to the single sufficient statistic $Y^*=\max\{-X_{(1)},X_{(n)}\}$.

Is the only thing stopping us from reducing the two-dimensional joint sufficient statistic $\mathbf Y=(Y_1,Y_n)$ for $\theta$ down to a single-dimensional ("minimal") sufficient statistic $Y=Y^*$ the fact that our Uniform distribution is not symmetric about zero? I could see this being the case, but I'd like to be sufficiently (no pun intended) sure of it before proceeding in my studies. It took me quite a bit of investigating to begin to understand how to work through various operations on the indicator function, so I might still be a little lost when it comes to its finer intricacies.

Best Answer

Now I hate to be the one to answer my own question, but I feel that in the time it took me to formulate my question in MathJax, I might have arrived at the answer.

First, let's look at why the reduction from a two-dimensional joint sufficient statistic to a one-dimensional sufficient statistic for $\theta$ works in the symmetric Uniform case:

Suppose $X_1,X_2,\ldots,X_n$ is a random sample from the symmetric Uniform distribution $\mathrm{Unif}(-\theta,\theta)$. By the factorization theorem, it is easy to verify that the vector $\mathbf Y = (Y_1,Y_n)$, where $Y_1 = X_{(1)}$ and $Y_n=X_{(n)}$, is a joint sufficient vector of degree two for $\theta$, with $$K_1(Y_1,Y_n;\theta)=\left(\frac{1}{2\theta}\right)^n \cdot \mathbf 1_{(-\theta,\theta)}(Y_1) \cdot \mathbf 1_{(-\theta,\theta)}(Y_n)$$

From the two indicator functions and from the definition of order statistics, we have that $$-\theta<Y_1<Y_n<\theta \implies \theta>-Y_1 \land \theta>Y_n$$

This allows us to combine $-Y_1$ and $Y_n$ through the maximum function into a single restriction on $\theta$: setting $Y^* = \max\{-Y_1,Y_n\}$, we have that $$\mathbf 1_{(-\theta,\theta)}(Y_1) \cdot \mathbf 1_{(-\theta,\theta)}(Y_n) = \mathbf 1_{(-\theta,\theta)}(Y^*)$$ is a valid equality.
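
As a quick sanity check of that identity (purely illustrative; the sampling ranges below are arbitrary), one can verify it on randomly generated triples $(y_1, y_n, \theta)$ with $y_1 \le y_n$ and $\theta > 0$:

```python
import numpy as np

rng = np.random.default_rng(1)

def ind(lo, hi, v):
    """Indicator of the open interval (lo, hi)."""
    return 1.0 if lo < v < hi else 0.0

# Check  1_(-t,t)(y1) * 1_(-t,t)(yn)  ==  1_(-t,t)(max(-y1, yn))
# on randomly generated triples; purely a numerical sanity check.
for _ in range(100_000):
    y1, yn = np.sort(rng.uniform(-3, 3, size=2))   # y1 <= yn, as order statistics
    t = rng.uniform(0.001, 3)                      # theta > 0
    y_star = max(-y1, yn)
    assert ind(-t, t, y1) * ind(-t, t, yn) == ind(-t, t, y_star)
print("identity holds on all sampled triples")
```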

On the other hand, suppose $X_1,X_2,\ldots,X_n$ is a random sample from the Uniform distribution $\mathrm{Unif}(\theta-1,\theta+1)$. By the factorization theorem, it is easy to verify that the vector $\mathbf Y = (Y_1,Y_n)$, where $Y_1 = X_{(1)}$ and $Y_n=X_{(n)}$, is a joint sufficient vector of degree two for $\theta$, with $$K_1(Y_1,Y_n;\theta)=\left(\frac{1}{2}\right)^n \cdot \mathbf 1_{(\theta-1,\theta+1)}(Y_1) \cdot \mathbf 1_{(\theta-1,\theta+1)}(Y_n)$$

From the two indicator functions and from the definition of order statistics, we have that $$\theta-1<Y_1<Y_n<\theta+1 \implies Y_1+1>\theta \land Y_n-1<\theta$$

Because we now have $\theta$ sandwiched between two restrictions ("variables", for our purposes), and without the benefit of appealing to the symmetry of the situation, we have no tools available to condense the information provided by $Y_1$ and $Y_n$ any further. Thus, we must concede that the joint sufficient statistics $Y_1$ and $Y_n$ are joint minimal sufficient statistics for $\theta$ for this non-symmetric Uniform distribution. On the other hand, we have also shown that $Y^*=\max\{-Y_1,Y_n\}$ is the one-dimensional and (thus) minimal sufficient statistic for $\theta$ for the symmetric Uniform distribution.
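
To make the asymmetric case concrete, here is a small sketch (with arbitrary illustrative numbers): two samples with the same midrange but different extremes pin down very different sets of $\theta$ with nonzero likelihood, so the midrange alone cannot carry all of the information that the pair $(Y_1,Y_n)$ does.

```python
# Two hypothetical samples from Unif(theta - 1, theta + 1) with the SAME midrange
# but different extremes.  The likelihood is nonzero exactly for
# theta in (yn - 1, y1 + 1), and those intervals differ between the samples.
samples = {
    "A": (2.1, 3.9),   # widely spread extremes  -> theta pinned to (2.9, 3.1)
    "B": (2.9, 3.1),   # tightly clustered extremes -> theta anywhere in (2.1, 3.9)
}

for name, (y1, yn) in samples.items():
    mid = (y1 + yn) / 2
    lo, hi = yn - 1, y1 + 1   # feasible theta interval
    print(f"sample {name}: midrange = {mid}, feasible theta in ({lo:.1f}, {hi:.1f})")
```

Both samples report the same $\hat\theta = 3$, yet they say very different things about which values of $\theta$ are even possible; that lost information is exactly why the midrange by itself does not satisfy the factorization criterion.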