Solved – Gaussian Process Regression with positive prediction weights

Tags: constrained-regression, gaussian-process, regression, sampling

I want to do Gaussian Process Regression for a density function $f(x)$ with a Gaussian kernel function $k(x, x')$.

Given the training data $\mathbf{x} = (x_1, x_2, \dots, x_N)$ and $\mathbf{f} = (f(x_1), f(x_2), \dots, f(x_N))$, the prediction at a new point $x^*$ is:
$$
f(x^*) = \mathbf{k}^T \mathbf{C}_N^{-1}\mathbf{f}
$$
where
$$
\mathbf{k} =
\begin{bmatrix}
k(x_1, x^*)\\
\vdots\\
k(x_N, x^*)
\end{bmatrix},
\qquad
\mathbf{C}_N =
\begin{bmatrix}
k(x_1, x_1) & \dots & k(x_1, x_N)\\
\vdots & \ddots & \vdots\\
k(x_N, x_1) & \dots & k(x_N, x_N)
\end{bmatrix}
$$
As mentioned in Bishop's Pattern Recognition and Machine Learning, I can rewrite this as
$$
f(x^*) = \sum_{i = 1}^N a_i k(x_i, x^*)
$$
where $a_i$ is the $i^{\text{th}}$ component of $\mathbf{C}_N^{-1}\mathbf{f}$.

Since I use a Gaussian kernel $k$, the prediction is a weighted sum of Gaussians centred at the training points, so I can sample directly from my density function if all $a_i$ are positive.
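For concreteness, here is a minimal NumPy sketch of this idea; the training data, lengthscale `ell`, and jitter term are invented for illustration. It computes the weights $a_i$ and, if they all turn out positive, samples from the resulting Gaussian mixture:

```python
import numpy as np

def gauss_kernel(x, xp, ell=0.5):
    """Gaussian (RBF) kernel k(x, x') = exp(-(x - x')^2 / (2 ell^2))."""
    return np.exp(-0.5 * ((x - xp) / ell) ** 2)

# Hypothetical training data: inputs x and density values f(x).
x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
f = np.array([0.05, 0.25, 0.40, 0.25, 0.05])

# Gram matrix C_N (a small jitter keeps the solve numerically stable).
C = gauss_kernel(x[:, None], x[None, :]) + 1e-10 * np.eye(len(x))

# Weights a = C_N^{-1} f, so that f(x*) = sum_i a_i k(x_i, x*).
a = np.linalg.solve(C, f)

if np.all(a > 0):
    # The prediction is a positive mixture of Gaussians centred at the x_i
    # with standard deviation ell, so we can sample directly: pick a
    # component with probability proportional to a_i, then draw from the
    # corresponding Gaussian.
    rng = np.random.default_rng(0)
    w = a / a.sum()
    idx = rng.choice(len(x), size=1000, p=w)
    samples = rng.normal(loc=x[idx], scale=0.5)  # scale = ell
else:
    print("Some weights are negative; direct mixture sampling fails.")
```

In practice, nothing in the standard GP posterior forces the $a_i$ to be positive, which is exactly the problem below.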

My question is: is there a way to constrain the $a_i$ to be positive in the GP regression?

Best Answer

I see a practical (read: quick and dirty) way to approach this, but in general Gaussian processes are poorly suited to modelling densities: the values of a Gaussian process at the training and prediction points are jointly multivariate normal, so they cannot be positive with certainty.

If you are only interested in the mean function, a quick and dirty way around this is to impose a non-zero prior mean. This can lift the mean function above zero in regions where you have little data; whether that makes sense depends on the nature of your problem. You might also want to impose the constraint that the paths integrate to one. This constraint can be implemented cleanly (see Section 9.4 of Rasmussen and Williams, Gaussian Processes for Machine Learning, where they discuss derivative observations; integrals work the same way in spirit).
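As an illustration of the non-zero prior mean idea, here is a hedged sketch; the mean level `m`, kernel lengthscale, and data are invented. With constant prior mean $m$, the posterior mean becomes $m + \mathbf{k}^T \mathbf{C}_N^{-1}(\mathbf{f} - m\mathbf{1})$, which reverts to $m$ rather than to zero away from the data:

```python
import numpy as np

def gauss_kernel(x, xp, ell=0.5):
    return np.exp(-0.5 * ((x - xp) / ell) ** 2)

# Hypothetical data and a constant prior mean m (assumptions of this sketch).
x = np.array([-1.0, 0.0, 1.0])
f = np.array([0.2, 0.5, 0.2])
m = 0.1  # far from the data the prediction reverts to m, not to 0

C = gauss_kernel(x[:, None], x[None, :]) + 1e-10 * np.eye(len(x))
alpha = np.linalg.solve(C, f - m)  # weights against the centred targets

def predict(x_star):
    """Posterior mean of the GP with constant prior mean m."""
    k = gauss_kernel(x, x_star)
    return m + k @ alpha

print(predict(0.25))   # near the data
print(predict(10.0))   # far away: approximately m
```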

Depending on what you need, an even better approach might be to regress on $\log f$ instead of $f$ and exponentiate the posterior mean, which is positive by construction.
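A hedged sketch of that variant, reusing the same invented toy data:

```python
import numpy as np

def gauss_kernel(x, xp, ell=0.5):
    return np.exp(-0.5 * ((x - xp) / ell) ** 2)

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
f = np.array([0.05, 0.25, 0.40, 0.25, 0.05])

# Fit the GP to log f instead of f.
g = np.log(f)
C = gauss_kernel(x[:, None], x[None, :]) + 1e-10 * np.eye(len(x))
alpha = np.linalg.solve(C, g)

def predict_density(x_star):
    """exp of the GP posterior mean for log f: positive by construction."""
    k = gauss_kernel(x, x_star)
    return np.exp(k @ alpha)

print(predict_density(0.3))  # always > 0
```

Note that with a zero prior mean on $\log f$, the prediction far from the data reverts to $\exp(0) = 1$, so a suitably negative prior mean is advisable for the tails, and the exponentiated mean will generally need renormalising to integrate to one.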
