For completeness' sake (and to help improve the stats of this site, ha), I have to wonder if this paper wouldn't also answer your question?
ABSTRACT:
We discuss the choice of polynomial basis for approximation of uncertainty propagation through complex simulation models with capability to output derivative information. Our work is part of a larger research effort in uncertainty quantification using sampling methods augmented with derivative information. The approach has new challenges compared with standard polynomial regression. In particular, we show that a tensor product multivariate orthogonal polynomial basis of an arbitrary degree may no longer be constructed. We provide sufficient conditions for an orthonormal set of this type to exist, a basis for the space it spans. We demonstrate the benefits of the basis in the propagation of material uncertainties through a simplified model of heat transport in a nuclear reactor core. Compared with the tensor product Hermite polynomial basis, the orthogonal basis results in a better numerical conditioning of the regression procedure, a modest improvement in approximation error when basis polynomials are chosen a priori, and a significant improvement when basis polynomials are chosen adaptively, using a stepwise fitting procedure.
Otherwise, the tensor-product basis of one-dimensional polynomials is not only the appropriate technique, but also the only one I can find for this.
There's no optimal polynomial basis for the log-normal distribution, because it's not determinate in the Hamburger sense. The approach indicated by Xiu and Karniadakis (Generalized Polynomial Chaos) doesn't always work. I'll be a bit informal here. For a rigorous treatment, see here.
Let $X$ denote a continuous random variable and $f_X(x)$ its pdf. Also, let's assume that $X$ has finite absolute moments of all orders, i.e.,
$$ \int_\mathbb{R}|x|^n f_X(x)\,dx < \infty \quad \forall n \in \mathbb{N}$$
This implies that $X$ has finite moments of all orders, i.e.,
$$ m_n=\int_\mathbb{R}x^n f_X(x)\,dx < \infty \quad \forall n \in \mathbb{N}$$
With these conditions, you can determine the family of orthogonal polynomials associated with $f_X(x)$, for example by applying the Gram-Schmidt procedure to the monomial basis, or by using the more stable Stieltjes procedure. The lognormal distribution satisfies these assumptions, and thus there exists a set of orthonormal polynomials associated with it, the Stieltjes–Wigert polynomials.
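If you want to see the construction concretely, here's a minimal R sketch (an illustration, not production code) of the Gram-Schmidt route for the standard lognormal, where every inner product reduces to the moments $m_n = e^{n^2/2}$. Classical Gram-Schmidt on the monomials is badly conditioned, which is exactly why the Stieltjes procedure is preferred for higher degrees:

```r
# Orthonormal polynomials w.r.t. the standard lognormal pdf, by Gram-Schmidt
# on the monomial basis. Inner products E[p(X) q(X)] reduce to moments of X.
m <- function(n) exp(n^2 / 2)             # E[X^n] for X ~ Lognormal(0, 1)

ip <- function(p, q) {                    # p, q: coefficient vectors (low -> high)
  s <- 0
  for (i in seq_along(p))
    for (j in seq_along(q))
      s <- s + p[i] * q[j] * m(i + j - 2) # term p_i q_j E[X^(i+j-2)]
  s
}

deg <- 3
polys <- list()
for (k in 0:deg) {
  p <- c(rep(0, k), 1)                    # start from the monomial x^k
  for (q in polys) {
    q <- c(q, rep(0, length(p) - length(q)))  # pad to common length
    p <- p - ip(p, q) * q                 # remove projection onto earlier polys
  }
  polys[[k + 1]] <- p / sqrt(ip(p, p))    # normalize to unit norm
}
polys  # coefficients; up to parameterization these are Stieltjes-Wigert polynomials
```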
However, this is not enough to be able to expand each mean-square integrable random variable $Y=g(X)$ in a series of orthogonal polynomials of $X$ which converges in mean square to $Y$. The reason is that, with the conditions we stated, we cannot guarantee that the orthogonal polynomials are dense in $A=L^2(\Omega,\sigma(X),P)$, where $\Omega$ is the sample space of the probability space to which $X$ is associated, $P$ is the probability measure of the probability space, and $\sigma(X)$ is the $\sigma$-algebra generated by $X$.
If this sounds too complicated, think of it like this: the space of all mean-square integrable random variables $Y$ which are a measurable function $g(X)$ of $X$ is a very large space. Unless you are sure that the polynomials orthonormal with respect to $f_X(x)$ are dense in such a space, there may be some $Y$ which cannot be expressed as a series of said polynomials.
Under our assumptions, a necessary and sufficient condition for the density of the orthonormal polynomials in $A$ is that $f_X(x)$ is determinate in the Hamburger sense, i.e., that it is uniquely determined by the sequence of its moments of all orders. The lognormal distribution is not determinate in the Hamburger sense, and thus there exist mean-square integrable RVs $Y=g(X)$ such that the expansion of $Y$ in a series of Stieltjes–Wigert polynomials either doesn't converge, or converges to a limit which is not $Y$ (recall that convergence here is always in the mean-square sense). For some examples, again see the paper I linked above.
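To get a feel for what indeterminacy means here, recall Heyde's classical example: for every $|a|\le 1$, the perturbed density
$$ f_a(x) = f_X(x)\left[1 + a\sin(2\pi\ln x)\right], \qquad x>0, $$
has exactly the same moments of all orders as the standard lognormal density $f_X(x)$, because $\int_0^\infty x^n f_X(x)\sin(2\pi\ln x)\,dx = 0$ for every integer $n\ge 0$. Since the moments cannot even identify the distribution, it is not so surprising that moment-based objects like orthogonal polynomials fail to be dense.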
End of story: for your random-coefficient Poisson PDE, it's probably better to represent $u$ as a series of Hermite polynomials in the underlying Gaussian variable. The convergence rate will be quite slow, though.
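As a quick sanity check on the Hermite route: writing the lognormal input as $X=e^Z$ with $Z\sim N(0,1)$, the generating function of the probabilists' Hermite polynomials, $\sum_{n\ge 0} He_n(z)\,t^n/n! = e^{zt-t^2/2}$, evaluated at $t=1$ gives the explicit Fourier–Hermite expansion
$$ e^Z = e^{1/2}\sum_{n=0}^{\infty}\frac{He_n(Z)}{n!}, $$
whose normalized coefficients $e^{1/2}/\sqrt{n!}$ decay very fast. The slow convergence warned about above shows up not for $X$ itself but for less smooth functionals $g(X)$, such as the PDE solution $u$, whose Hermite coefficients can decay much more slowly.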
Best Answer
It really depends on your needs.
However, with regression and other "linear-model" problems (such as GLMs), the standard choice is orthogonal polynomials with respect to the observed set of $x$ values (usually just called "orthogonal polynomials" in regression-type contexts). Many packages provide them (e.g. `poly` in R provides such a basis; you supply `x` and the desired degree). That is, if $P$ is the resulting "x-matrix" (not counting the constant column), where the columns represent the linear, quadratic, etc. components, then $P^\top P=I$.
Like so, in a minimal sketch (the `x` values below are arbitrary; any set of distinct values works):
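```r
x <- c(1.0, 1.2, 1.7, 2.1, 2.3, 2.8, 3.0, 3.7, 4.1, 4.5)  # arbitrary example
P <- poly(x, degree = 3)   # linear, quadratic and cubic orthogonal columns
round(crossprod(P), 10)    # t(P) %*% P: the 3 x 3 identity, up to rounding
```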
(This property extends to the constant column if you appropriately normalize it, but it's usually left as-is, so the diagonal of $P^\top P$, with the constant column included, would then have a $1,1$ element of $n$ rather than $1$.)
For that particular set of x-values*, they look like this (one way to draw them, reusing the `x` and `P` from the sketch above):
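```r
xs <- seq(min(x), max(x), length.out = 200)
B  <- predict(P, xs)                     # evaluate the same basis on a fine grid
matplot(xs, B, type = "l", lty = 1,
        xlab = "x", ylab = "orthogonal polynomial value")
legend("topleft", legend = paste("degree", 1:3), col = 1:3, lty = 1)
```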
These have a number of distinct advantages over other choices (including that the parameter estimates are uncorrelated).
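To see the uncorrelated-estimates point in action, here is a small sketch reusing the `x` from above; the response `y` is made up purely for illustration:

```r
set.seed(1)
y <- 1 + 0.5 * x - 0.2 * x^2 + rnorm(length(x), sd = 0.1)  # made-up response
fit_orth <- lm(y ~ poly(x, 3))                # orthogonal polynomial basis
fit_raw  <- lm(y ~ poly(x, 3, raw = TRUE))    # raw powers x, x^2, x^3
round(cov2cor(vcov(fit_orth)), 3)   # essentially the identity: uncorrelated
round(cov2cor(vcov(fit_raw)), 3)    # strongly correlated estimates
```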
Some references that may be of some use to you:
Narula, Subhash C. (1979), "Orthogonal Polynomial Regression," International Statistical Review, 47:1 (Apr.), pp. 31-36.

Kennedy, W. J. Jr. and Gentle, J. E. (1980), Statistical Computing, Marcel Dekker.
* on the off chance anyone cares about the particular values in the example: they're the arbitrary `x` defined in the first code sketch above.