Solved – How to implement 2-D Gaussian process regression with GPML (MATLAB)

gaussian process, MATLAB, normal distribution, regression

I started working with Gaussian processes only two weeks ago, and I am not very familiar with choosing a model and its hyperparameters. Here is the demo code I run for a 2-D Gaussian process regression; its output is not what I expected.

% produce the training set for regression.
% Here, the regression target Y is the sum of the inputs.
X1train = linspace(-4.5,4.5,10);
X2train = linspace(-4.5,4.5,10);
X = [X1train' X2train'];
Y = sum(X,2);

% produce the test set for regression
Xtest = [1 2];

% set the hyperparameters
covfunc = {@covMaterniso, 3};
ell = 1/4; sf = 1;
hyp.cov = log([ell; sf]);
likfunc = @likGauss;
sn = 0.1;
hyp.lik = log(sn);

% implement regression
[ymu ys2 fmu fs2] = gp(hyp, @infExact, [], covfunc, likfunc, X, Y, Xtest);

The result is ymu = 0.131695275851991, but I expected ymu = 1 + 2 = 3.

Best Answer

This seems to be a modelling problem rather than an implementation problem. Let us plot the predictions (ymu) of your model, trained on your data, over a denser grid of test points:

% More test data
[Xtest1, Xtest2] = meshgrid(-4.5:0.1:4.5, -4.5:0.1:4.5);
Xtest = [Xtest1(:) Xtest2(:)];

% implement regression 
[ymu ys2 fmu fs2] = gp(hyp, @infExact, [], covfunc, likfunc, X, Y, Xtest);

pcolor(Xtest1,Xtest2, reshape(ymu,size(Xtest1)))
shading('flat')

[Figure: Model prediction with length-scale parameter 0.25]

The axes of the image represent the inputs Xtest1, Xtest2, and the color is the predicted mean. We see that the model output deviates from 0 only near the training points. This is due to the short length-scale hyperparameter (ell), which makes the model learn only short-range structure. Let us try modifying that, with all other code kept unchanged:

ell = 4;
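Concretely, the changed length-scale has to be fed back into hyp.cov before gp is called again. A minimal sketch of the re-run, assuming X, Y, Xtest, Xtest1, Xtest2, covfunc, likfunc, sf and hyp.lik from the code above are still in the workspace:

% re-build the covariance hyperparameters from the new length-scale
hyp.cov = log([ell; sf]);
[ymu ys2 fmu fs2] = gp(hyp, @infExact, [], covfunc, likfunc, X, Y, Xtest);
pcolor(Xtest1, Xtest2, reshape(ymu, size(Xtest1)))
shading('flat')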

[Figure: Model output with length-scale parameter 4]

Now the predicted output is much closer to your ground truth. Indeed, the prediction at $(1,2)$ is now $2.9$. It seems that, with the training data kept fixed, the longer the length-scale, the closer the prediction at your test point gets to 3.
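To make that trend concrete, here is a small sketch that sweeps the length-scale and prints the predicted mean at the test point $(1,2)$; it assumes X, Y, covfunc, likfunc and hyp.lik from the code above, and the exact numbers may differ slightly on your machine:

% predicted mean at (1,2) for increasing length-scales
for ell = [0.25 1 2 4 8]
    hyp.cov = log([ell; 1]);   % signal standard deviation sf kept at 1
    ymu = gp(hyp, @infExact, [], covfunc, likfunc, X, Y, [1 2]);
    fprintf('ell = %5.2f  ->  ymu at (1,2) = %.3f\n', ell, ymu);
end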

Another observation is that all your training data lie on the $x_1 = x_2$ line, so it is rather optimistic to hope that any model would learn what happens away from that line without making strong assumptions. Besides changing the hyperparameters, you could also play with adding training data at various other points, e.g., on a 2-dimensional grid, as sketched below.
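A minimal sketch of that last idea, keeping the Matern covariance and the ell = 4, sf = 1, sn = 0.1 hyperparameters from above (an assumption for illustration, not a tuned choice):

% put the training inputs on a coarse 2-D grid instead of the x1 = x2 line,
% with the same target y = x1 + x2
[X1g, X2g] = meshgrid(linspace(-4.5, 4.5, 7));
Xgrid = [X1g(:) X2g(:)];
Ygrid = sum(Xgrid, 2);
ymu = gp(hyp, @infExact, [], covfunc, likfunc, Xgrid, Ygrid, [1 2]);
% with these settings, ymu should come out close to the true value 3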