Is Gaussian Process Regression a linear model?

gaussian process, linear model, regression

I had a discussion today with someone saying that Gaussian Processes are linear models. I don't see in which sense this may be correct. To be clear, here the definition of a linear model is the usual one, i.e., a model which is linear in the parameters. Thus,

$y=\beta_0+\beta_1x+\beta_2\sin(x)+\epsilon$

and

$y=\beta_0+\boldsymbol{\beta}^T\cdot\mathbf{x}+\epsilon$

are linear, and

$y=\beta_0+\beta_1x+\beta_2\exp({\beta_3x})+\epsilon$

is not.
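To make the "linear in the parameters" definition concrete, here is a minimal sketch assuming NumPy. The coefficient values, noise level and sample size are made up for illustration. Because the first model is linear in $\beta_0,\beta_1,\beta_2$, it can be fit by ordinary least squares on a design matrix with columns $1$, $x$ and $\sin(x)$, even though it is nonlinear in $x$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" coefficients, chosen only for illustration.
beta_true = np.array([1.0, 0.5, 2.0])

x = np.linspace(0.0, 5.0, 50)

# Design matrix with columns 1, x, sin(x): nonlinear in x,
# but the model is still linear in the coefficients.
Phi = np.column_stack([np.ones_like(x), x, np.sin(x)])

# Simulated responses with a small amount of Gaussian noise.
y = Phi @ beta_true + 0.1 * rng.standard_normal(x.size)

# Ordinary least squares has a closed-form solution for this model.
beta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(beta_hat)
```

The third model, with $\exp(\beta_3 x)$, cannot be written as a fixed design matrix times a coefficient vector, so it would need an iterative nonlinear fit instead.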

For simplicity, let's consider a Squared Exponential covariance function, and assume that the correlation length, the signal variance and the noise variance are known. Given a design matrix $X$ and corresponding response vector $\mathbf{y}$, the GP prediction at a new prediction point $\mathbf{x}^*$ is

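$$\hat{y}(\mathbf{x}^*)=\mathbf{k}_*^T(K+\sigma^2 I)^{-1}\mathbf{y}$$

where $K$ is the covariance matrix of the training inputs, $\mathbf{k}_*$ is the vector of covariances between $\mathbf{x}^*$ and the training inputs, and $\sigma^2$ is the noise variance.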

Now, this estimator is clearly a nonlinear function of $X$ and a linear function of $\mathbf{y}$. The other person insisted that $\mathbf{y}$ is the parameter vector of this model, and thus the model is linear. I don't think this makes any sense: it would mean that the number of parameters of the model depends on the sample size. I think we can at most say that the estimator is a linear function of $\mathbf{y}$, but surely not the statistical model underlying Gaussian Process Regression. Do you agree?
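The two claims about the estimator can be checked numerically. The following is a rough sketch assuming NumPy, with fixed (known) hyperparameters as in the setup above. The helper names `se_kernel` and `gp_mean`, the hyperparameter values and the random data are all illustrative choices, not part of any particular library.

```python
import numpy as np

def se_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared Exponential covariance between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)

def gp_mean(X, y, X_star, noise_var=0.1):
    """GP predictive mean k_*^T (K + noise_var * I)^{-1} y at each row of X_star."""
    K = se_kernel(X, X)
    k_star = se_kernel(X, X_star)  # shape (n_train, n_test)
    return k_star.T @ np.linalg.solve(K + noise_var * np.eye(len(X)), y)

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(20, 1))
X_star = np.array([[0.3], [1.7]])
y1 = rng.standard_normal(20)
y2 = rng.standard_normal(20)

# Linear in y: predicting from a linear combination of responses gives
# the same linear combination of the individual predictions.
lhs = gp_mean(X, 2.0 * y1 + 3.0 * y2, X_star)
rhs = 2.0 * gp_mean(X, y1, X_star) + 3.0 * gp_mean(X, y2, X_star)
linear_in_y = np.allclose(lhs, rhs)

# Not linear in the prediction point: f(a + b) differs from f(a) + f(b).
a, b = np.array([[0.3]]), np.array([[1.7]])
additive_in_x_star = np.allclose(
    gp_mean(X, y1, a + b), gp_mean(X, y1, a) + gp_mean(X, y1, b)
)

print(linear_in_y, additive_in_x_star)
```

The first check holds for any fixed kernel and noise variance, since $\mathbf{y}$ enters the formula only through a matrix-vector product. The second check is only a single counterexample to linearity in the inputs, not a proof.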

Best Answer

I think the technically correct term to use here is that GP regression is a linear smoother, i.e. its predictions are a linear combination of the observed training outputs, with weights that depend only on the inputs. This does not make the model as such linear. For that to be true, the predictions must be a linear function of the inputs. This is only the case with GPs if you use a linear covariance function.
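As a small illustration of the last point, here is a sketch assuming NumPy and a linear covariance function $k(\mathbf{x},\mathbf{x}')=\mathbf{x}^T\mathbf{x}'$ without an intercept term. The function names, the noise variance and the random data are hypothetical. With this kernel the GP predictive mean is additive in the prediction point and, as far as I can tell, coincides with a ridge regression fit whose penalty equals the noise variance.

```python
import numpy as np

def smoother_weights(K, k_star, noise_var):
    """Weights w such that the prediction is w @ y; they depend on inputs only."""
    return np.linalg.solve(K + noise_var * np.eye(K.shape[0]), k_star)

rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, size=(15, 2))
y = rng.standard_normal(15)
noise_var = 0.1  # assumed known, as in the question

def predict_linear_kernel(x_star):
    """GP predictive mean under the linear kernel k(x, x') = x @ x'."""
    K = X @ X.T          # covariance matrix of the training inputs
    k_star = X @ x_star  # covariances between x_star and the training inputs
    return smoother_weights(K, k_star, noise_var) @ y

a = np.array([0.5, -1.0])
b = np.array([1.5, 0.25])

# With a linear kernel the prediction is additive in the inputs.
additive = np.isclose(
    predict_linear_kernel(a + b),
    predict_linear_kernel(a) + predict_linear_kernel(b),
)

# The same prediction from ridge regression with penalty noise_var.
w_ridge = np.linalg.solve(X.T @ X + noise_var * np.eye(X.shape[1]), X.T @ y)
pred_gp = predict_linear_kernel(a)
pred_ridge = a @ w_ridge

print(additive, np.isclose(pred_gp, pred_ridge))
```

Swapping in a Squared Exponential kernel keeps the "weights times $\mathbf{y}$" structure of a linear smoother, but the additivity check would then be expected to fail.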
