Solved – How to prevent overfitting in Gaussian processes

gaussian process, overfitting, regularization, scikit learn

I'm training Gaussian process models on a relatively small data set with 8 input features and 75 samples.

I tried different kernels and found that the following kernel (two RBFs plus a white-noise term) works best.

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Placeholder hyperparameter values; fit() re-optimizes them anyway
sigma_1, sigma_2, sigma_3 = 1.0, 1.0, 0.1
length_scale_1, length_scale_2 = 1.0, 10.0

k1 = sigma_1**2 * RBF(length_scale=length_scale_1)  # first RBF component
k2 = sigma_2**2 * RBF(length_scale=length_scale_2)  # second RBF component
k3 = WhiteKernel(noise_level=sigma_3**2)            # white-noise term

kernel = k1 + k2 + k3

I used 10-fold CV to compute the R^2 score and found that the average training R^2 is always > 0.999, while the average validation R^2 is only about 0.65.
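For reference, here is a minimal sketch of how that training/validation comparison can be computed with scikit-learn's cross_validate; X and y are placeholders for the actual 75 x 8 data set:

import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.gaussian_process import GaussianProcessRegressor

# X (75 x 8 feature matrix) and y (targets) stand in for the real data
gpr = GaussianProcessRegressor(kernel=kernel)
cv = cross_validate(gpr, X, y, cv=10, scoring="r2", return_train_score=True)
print("mean training R^2:  ", np.mean(cv["train_score"]))
print("mean validation R^2:", np.mean(cv["test_score"]))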

It looks like the models are overfitting. I'm wondering what we can do to prevent overfitting in Gaussian processes.

In linear regression we can add regularization, and in neural networks we can add regularization and dropout.

What about Gaussian processes?

Best Answer

Gaussian processes are susceptible to overfitting when your data set is too small, especially when you have weak prior knowledge of the covariance structure: the set of hyperparameters that the marginal-likelihood optimization picks for the covariance kernel then often makes no physical sense.
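One quick way to check this (a sketch, assuming a GaussianProcessRegressor named gpr that has already been fit) is to inspect the optimized kernel, which scikit-learn stores in the kernel_ attribute:

# After gpr.fit(X, y), the optimized hyperparameters live in gpr.kernel_
print(gpr.kernel_)
# Vanishing RBF length scales or a near-zero WhiteKernel noise level are
# red flags: the model is interpolating noise rather than capturing structure.
print(gpr.log_marginal_likelihood(gpr.kernel_.theta))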

Gaussian processes also tend to perform very poorly in cross-validation when the sample size is small, especially when the samples were drawn from a space-filling design of experiments.

To limit overfitting:

  • set the lower bounds of the RBF kernels' hyperparameters to values as high as your prior knowledge reasonably allows
  • try progressively increasing the noise kernel's noise_level, or use sklearn's alpha parameter of GaussianProcessRegressor, increasing the values corresponding to the training points where the GPR seems to overfit the most (see the sketch after this list)
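As an illustration of both points, here is a sketch in scikit-learn; the bounds and values are placeholders to adapt to your own prior knowledge:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Keep the RBF length scales away from tiny, noise-chasing values by
# raising their lower bounds (placeholder values; choose them from prior
# knowledge of how fast the response can plausibly vary).
k1 = 1.0 * RBF(length_scale=10.0, length_scale_bounds=(1.0, 1e3))
k2 = 1.0 * RBF(length_scale=100.0, length_scale_bounds=(10.0, 1e4))

# Force a minimum amount of explained noise via the WhiteKernel bounds ...
k3 = WhiteKernel(noise_level=1e-2, noise_level_bounds=(1e-3, 1e1))

kernel = k1 + k2 + k3

# ... and/or regularize directly with alpha, which adds jitter to the
# diagonal of the kernel matrix. It accepts a scalar, or one value per
# training point to target the samples the GPR overfits the most.
gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2)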