Complete statistic for Normal Distribution $\mathcal{N}(\mu, \mu^2)$

Tags: normal distribution, probability distributions, statistics

We call a random variable a "curved" normal if its distribution is $\mathcal{N}(\mu, \mu^2)$ with $\mu > 0$. A "curved" normal then has pdf
$$
\left(\dfrac{1}{2\pi \mu^2}\right)^{\frac{1}{2}}e^{\frac{-1}{2\mu^2}(x - \mu)^2}
$$

Here we can rewrite this pdf as $e^{t(x)^T \eta(\mu) - \epsilon(\mu)}h(x)$ where $t(x) = (x, x^2)$, $\eta(\mu) = \left(\dfrac{1}{\mu}, \dfrac{-1}{2\mu^2}\right)$, $\epsilon(\mu) = \dfrac{1}{2}[1 + \ln(2\pi \mu^2)]$ and $h(x) = 1$. Hence, the "curved" normal belongs to the exponential family. This leads me to the conclusion that the statistic
$$
T(\mathbf{X}) = \left(\displaystyle\sum_{i = 1}^{n} X_i, \displaystyle\sum_{i = 1}^{n} X_i^2\right)
$$

is a complete sufficient statistic for the parameter $\mu$, given that $\mathbf{X} = (X_1, X_2, \cdots, X_n)$ is a random sample of size $n$ drawn from this distribution.
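
For reference, the exponential-family form above comes from expanding the square in the log-density. This is only a sketch of the algebra, and $f(x;\mu)$ is notation introduced here for the pdf:
$$
\ln f(x;\mu) = -\dfrac{1}{2}\ln(2\pi\mu^2) - \dfrac{(x - \mu)^2}{2\mu^2} = x \cdot \dfrac{1}{\mu} + x^2 \cdot \left(\dfrac{-1}{2\mu^2}\right) - \dfrac{1}{2}\left[1 + \ln(2\pi\mu^2)\right]
$$

Reading off the coefficients of $x$ and $x^2$ gives the two components of $\eta(\mu)$, and the remaining terms give $\epsilon(\mu)$.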

However, we have that
$$
\mathbb{E}\left[\dfrac{1}{n}\displaystyle\sum_{i = 1}^{n} X_i^2 - 2S_n^2\right] = (\mu^2 + \mu^2) - 2\mu^2 = 0
$$

where $S_n^2$ is the sample variance. Hence, $T(\mathbf{X})$ cannot be a complete statistic, which contradicts the previous statement.
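
To spell out why this would be a counterexample (a sketch, assuming $S_n^2$ uses the divisor $n - 1$ and $n \geq 2$; the name $g$ is introduced here): the quantity inside the expectation depends on the data only through $T(\mathbf{X})$, since
$$
g(T) = \dfrac{1}{n}\displaystyle\sum_{i = 1}^{n} X_i^2 - \dfrac{2}{n - 1}\left[\displaystyle\sum_{i = 1}^{n} X_i^2 - \dfrac{1}{n}\left(\displaystyle\sum_{i = 1}^{n} X_i\right)^2\right]
$$

Using $\mathbb{E}[X_i^2] = \operatorname{Var}(X_i) + (\mathbb{E}[X_i])^2 = \mu^2 + \mu^2$ and $\mathbb{E}[S_n^2] = \mu^2$, we get $\mathbb{E}[g(T)] = 0$ for every $\mu$, while $g(T)$ itself is not almost surely zero.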

So my question is: what is wrong with my logic? Any help is appreciated, thanks!

Best Answer

A curved exponential family is not the same as a regular or full-rank exponential family, where the natural parameter space is assumed to contain an open subset of $\mathbb R^p$ (for some positive integer $p$). The completeness of the sufficient statistic in an exponential family actually depends on this open set condition.

In the $\left\{N(\mu,\mu^2):\mu \in \Omega\right\}$ family of distributions where $\Omega=\mathbb R \setminus \{0\}$, the natural parameter as you have found is of the form $\eta(\mu)=\left(\frac1\mu,-\frac1{2\mu^2}\right)$. The natural parameter space is therefore $$\tilde\eta(\Omega)=\{\eta(\mu):\mu \in \Omega\}=\left\{(x,y):y=-\tfrac{x^2}{2} ,\,x\in \mathbb R \setminus \{0\}\right\}$$

There is no open subset of $\mathbb R^2$ contained in $\tilde\eta(\Omega)$. So the $N(\mu,\mu^2)$ family does not belong to a regular two-dimensional exponential family. In particular, this is called a curved exponential family because $\tilde\eta(\Omega)$ is a curve (a parabola, to be precise) and its points do not satisfy a linear constraint. Observe that this is a two-dimensional exponential family with a one-dimensional parameter.

In curved exponential families, the sufficient statistic found using the factorization theorem is typically minimal sufficient, but it may not be complete. In the $N(\mu,\mu^2)$ model, the minimal sufficient statistic $T=(\sum X_i,\sum X_i^2)$ is not complete, as you have shown through a counterexample.
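
The counterexample can also be checked numerically. The following is a minimal Monte Carlo sketch, not a proof: it assumes NumPy is available and that the sample variance uses the divisor $n-1$, and the function names `g_of_T` and `simulate_mean_g`, the sample size, and the number of replications are illustrative choices rather than anything from the question.

```python
import numpy as np


def g_of_T(sum_x, sum_x2, n):
    """Evaluate (1/n) * sum(X_i^2) - 2 * S_n^2 using only T = (sum X_i, sum X_i^2).

    Assumption: S_n^2 is the sample variance with divisor n - 1.
    """
    s2 = (sum_x2 - sum_x**2 / n) / (n - 1)
    return sum_x2 / n - 2.0 * s2


def simulate_mean_g(mu, n=10, reps=100_000, seed=0):
    """Draw `reps` samples of size n from N(mu, mu^2) and summarize g(T).

    Returns the Monte Carlo mean of g(T), its standard error,
    and the standard deviation of g(T) across replications.
    """
    rng = np.random.default_rng(seed)
    # scale is the standard deviation, which is |mu| in this model
    x = rng.normal(loc=mu, scale=abs(mu), size=(reps, n))
    g = g_of_T(x.sum(axis=1), (x**2).sum(axis=1), n)
    sd = g.std(ddof=1)
    return g.mean(), sd / np.sqrt(reps), sd


if __name__ == "__main__":
    for mu in (0.5, 1.0, 3.0):
        mean_g, se, sd = simulate_mean_g(mu)
        print(f"mu={mu}: mean of g(T) = {mean_g:.5f} (s.e. {se:.5f}), sd of g(T) = {sd:.3f}")
```

If the counterexample holds, the simulated mean should sit within a few standard errors of zero for each $\mu$, while the standard deviation of $g(T)$ stays clearly positive. That is the pattern a complete statistic would rule out: a function of $T$ that is not identically zero but has zero expectation across the whole family.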
