Solved – Constructing a continuous distribution to match $m$ moments

Tags: approximation, distributions, moments

Suppose I have a large sample of size $n$ drawn from a continuous distribution, and $2 < m\ll n$ moments from that sample. Alternatively, suppose I have been given those moments by an angel, oracle, theory or fiat. Is there a named distribution, or a natural method of constructing distributions, that lets you exactly match $m$ moments for an arbitrary $m$?

I know that a finite number of moments will not uniquely determine a distribution. To narrow it down a little, let's limit it to distributions that are differentiable (and I presume analytic) and that reduce to the normal distribution when the first and second moments are arbitrary and the skewness, excess kurtosis and all higher moments are zero.

I would also strongly prefer a distribution that can be strictly positive for some set of moments and/or coefficients of the distribution function.

Instead of the usual central moments, I would accept a solution posed in terms of raw moments, or normalized moments, or cumulants if that makes the problem easier or produces a more interesting result (with the skewness and kurtosis analogs chosen in an analogous manner to those above).

It would also be a nice property, though I do not know if it is even possible, if the distribution for $m$ equal to some specific value $m^*$ could be reduced to the standard normal distribution by some explicitly statable invertible function of $x$ and the moments, opening the possibility of some kind of higher-order analogue of the t-statistic, or some other useful normalization.

Please note that I am not asking how to fit a predetermined distribution, nor how to select a distribution from some larger family. I am looking for the simplest and most natural way to fit a set of moments or moment-like quantities exactly, like fitting an $n$th degree polynomial to $n+1$ points.

Best Answer

I do not know of any exact matching technique. If an approximate technique would work for you (meaning a density that approximately matches the $m$ given moments), then you could consider an orthogonal polynomial series approach. You would choose a polynomial basis (Laguerre, Hermite, etc.) depending on the range of your data. I describe below a technique that I have used in Arbel et al. (1) for a compactly supported distribution (see details in Section 3).

In order to set the notation, consider a generic continuous random variable $X$ on $[0,1]$, denote by $f$ its density (to be approximated), and denote its raw moments by $\gamma_r=\mathbb{E}\big[X^r\big]$, with $r\in\mathbb{N}$. Denote the basis of Jacobi polynomials by $$G_i(s) = \sum_{r=0}^i G_{i,r}s^r,\quad i\geq 0.$$ These polynomials are orthonormal with respect to the $L^2$-product $$\langle F,G \rangle=\int_0^1 F(s)\, G(s)\, w_{a,b}(s)\,d s,$$ where $w_{a,b}(s)=s^{a-1}(1-s)^{b-1}$ is the weight function of the basis, proportional to a beta density in the case of Jacobi polynomials.

Any univariate density $f$ supported on $[0,1]$ can be uniquely decomposed on such a basis, so there is a unique sequence of real numbers $(\lambda_i)_{i \geq 0}$ such that $$f(s)=w_{a,b}(s)\sum_{i=0}^\infty \lambda_i G_i(s).$$ Evaluating $\int_0^1 f(s)\, G_i(s)\,d s$ shows that each $\lambda_i$ is a linear combination of the first $i$ moments of $X$, specifically $\lambda_i=\sum_{r=0}^i G_{i,r}\gamma_r$.

Now truncate the representation of $f$ in the Jacobi basis at a given level $N$, which provides the approximation $$f_N(s)=w_{a,b}(s)\sum_{i=0}^N \left(\sum_{r=0}^i G_{i,r}\gamma_r\right) G_i(s).$$ This polynomial approximation is not necessarily a density, as it might fail to be positive or to integrate to 1. To overcome this problem, one can consider the density $\pi$ proportional to its positive part, defined by $\pi(s)\propto\max(f_N(s),0)$. If sampling from $\pi$ is needed, one can resort to rejection sampling; see for instance Robert & Casella (2).
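To make this concrete, here is a minimal Python sketch of the same construction (not the momentify package) for the simplest case $a=b=1$, where the weight $w_{1,1}$ is constant and the orthonormal basis consists of shifted Legendre polynomials, whose monomial coefficients are $G_{i,r}=\sqrt{2i+1}\,(-1)^{i+r}\binom{i}{r}\binom{i+r}{r}$. The function names and the Beta(2, 5) test case below are my own illustrative choices.

```python
import math
import numpy as np


def shifted_legendre_coeffs(i):
    """Monomial coefficients G_{i,r}, r = 0..i, of the orthonormal
    shifted Legendre polynomial G_i on [0, 1] (Jacobi case a = b = 1,
    where the weight w_{1,1} is constant)."""
    norm = math.sqrt(2 * i + 1)  # makes int_0^1 G_i(s)^2 ds = 1
    return np.array([
        norm * (-1) ** (i + r) * math.comb(i, r) * math.comb(i + r, r)
        for r in range(i + 1)
    ])


def density_from_moments(gamma, N):
    """Return the truncated series approximation f_N built from the
    raw moments gamma = (gamma_0, ..., gamma_N), with gamma_0 = 1."""
    gamma = np.asarray(gamma, dtype=float)
    coeffs = [shifted_legendre_coeffs(i) for i in range(N + 1)]
    # lambda_i = sum_r G_{i,r} gamma_r = E[G_i(X)]
    lam = [c @ gamma[: i + 1] for i, c in enumerate(coeffs)]

    def f_N(s):
        s = np.asarray(s, dtype=float)
        total = np.zeros_like(s)
        for l, c in zip(lam, coeffs):
            total += l * np.polyval(c[::-1], s)  # polyval wants degrees high-to-low
        return total

    return f_N


def rejection_sample(f_N, size, seed=None):
    """Sample from pi(s) proportional to max(f_N(s), 0) by rejection,
    with a uniform proposal on [0, 1] and a grid-based envelope."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 2001)
    M = float(f_N(grid).max())  # approximate upper bound for f_N
    out = []
    while len(out) < size:
        u = rng.uniform(size=size)
        v = rng.uniform(0.0, M, size=size)
        out.extend(u[v < np.maximum(f_N(u), 0.0)].tolist())
    return np.array(out[:size])
```

As a quick check, one can feed in the raw moments of a known density such as Beta(2, 5), which satisfy $\gamma_r=\prod_{k=0}^{r-1}(\alpha+k)/(\alpha+\beta+k)$:

```python
# Approximate a Beta(2, 5) density from its first 6 raw moments.
a, b, N = 2.0, 5.0, 6
gamma = np.ones(N + 1)
for r in range(1, N + 1):
    gamma[r] = gamma[r - 1] * (a + r - 1) / (a + b + r - 1)
f_N = density_from_moments(gamma, N)
draws = rejection_sample(f_N, 10_000, seed=0)
```

For a general beta-like weight $w_{a,b}$ one would replace the shifted Legendre basis with the corresponding orthonormal Jacobi polynomials; the Legendre case keeps the sketch short.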

There is a companion R package called momentify that computes the approximated distribution from a given set of moments and allows one to sample from it. It is available at https://www.researchgate.net/publication/287608666_momentify_R_package and discussed in an accompanying blog post.

Below are two examples in which the number of matched moments increases. Note that the fit is much better for the unimodal density than for the multimodal one.

[Animated GIF: approximation of a unimodal density with an increasing number of matched moments]

[Animated GIF: approximation of a multimodal density with an increasing number of matched moments]

References

(1) Julyan Arbel, Antonio Lijoi, and Bernardo Nipoti. Full Bayesian inference with hazard mixture models. Computational Statistics & Data Analysis 93 (2016): 359-372.

(2) Christian Robert and George Casella. Monte Carlo Statistical Methods. Springer-Verlag, New York (2004).
