Connection of the likelihood function and sufficient statistics

statistical-inference, statistics

Casella & Berger has the following statement:

A function $T(x)$ of the sample is specified and the principle
(?sufficiency?) states that if $x$ and $y$ are two sample points with
$T(x) = T(y)$, then the same inference about $\theta$ should be made
whether $x$ or $y$ is observed. The function $T(x)$ is a sufficient
statistic when the Sufficiency Principle is used. The "value" of
$T(x)$ is the set of all likelihood functions proportional to
$L(\theta|x)$ if the Likelihood principle is used.

I have a hard time digesting this.

  1. Why does C&B refer to some arbitrary function and say that if its values are equal, we have to make the same inference about $\theta$? C&B then mentions that such a function "is a sufficient statistic". Isn't it vice versa? In C&B, the Sufficiency Principle actually requires that $T(x)$ be a sufficient statistic, and only then does the equality of evidence for $x$ and $y$ follow. What is wrong here?

  2. What does C&B mean by "The 'value' of $T(x)$ is the set of all likelihood functions proportional to $L(\theta|x)$ if the Likelihood Principle is used"? What do likelihood functions have to do with a sufficient statistic $T(x)$, and in what sense are they proportional? Also, why are we talking about "functions" when the likelihood for a particular parameter $\theta$ is actually a single value?

Best Answer

They are trying to clarify that the sufficiency principle and the likelihood principle are similarly structured. They state a general template for a principle as follows. (I will use $S$ instead of $T$ so as not to clash with the notation used when stating the sufficiency principle.)

General principle based on function $S$. If $x$ and $y$ are two sample points with $S(x)=S(y)$, then the same inference about $\theta$ should be made whether $x$ or $y$ is observed.

For different choices of $S$ you get different principles. The authors explain how the sufficiency principle and the likelihood principle are special cases of the above general principle for certain choices of $S$.

  • If $S:=T$ where $T$ is a sufficient statistic, then the above principle becomes the sufficiency principle.
  • If $S$ maps $x$ to the set $\{c L(\theta \mid x) : c > 0\}$, then the above principle becomes the likelihood principle. (Try to convince yourself that with this definition, $S(x)=S(y)$ means that $L(\theta \mid x)$ and $L(\theta \mid y)$ are proportional.)