Gaussian Process Regression – Should Data Be Standardized?

I am performing Gaussian process regression (GPR) and optimizing over the hyperparameters, using minFunc for all optimizations. My question is: should we (or rather, can we) standardize the data before passing it to the objective function? If we do standardize, the hyperparameters will be learned on the standardized data. At test time, however, assuming samples arrive one by one, it won't be possible to standardize each sample independently, right? (Unless we reuse the standardizing factors from the training data.) If it matters, all the elements of my data lie between -1 and 1, but some columns have a very small mean and variance compared to the other columns.

So my question is, should we normalize the data while doing GPR?

P.S. Actually, I observe some weird behavior if I don't standardize my data. For example, minFunc suddenly gives me a "step direction is illegal" error. Some online reading led me to believe that either there is a problem in the gradient calculation or the data is not standardized. I am sure about my gradient calculation; I have also checked it with the DerivativeCheck option. So that leaves the possibility that the data is not standardized.

Best Answer

Yes, it is desirable to standardize the data when learning a Gaussian process regression model. There are several reasons:

The common Gaussian process regression model assumes that the output $y$ has zero mean, so we should standardize $y$ to match that assumption.
Many covariance functions have scale parameters, so we should standardize the inputs to get better estimates of those parameters.
Gaussian process regression is prone to numerical problems because we have to invert a covariance matrix that is often ill-conditioned. Standardizing the data makes this problem less severe.
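On the test-time concern from the question: the usual approach is the one suggested there, namely to compute the standardizing factors once from the training data and reuse them for every new sample. Below is a minimal sketch of that idea in plain Python; the helper names `fit_standardizer` and `standardize` and the toy data are made up for illustration and do not come from minFunc or any GP package.

```python
from statistics import fmean, pstdev

def fit_standardizer(rows):
    """Return per-column means and standard deviations of the training rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def standardize(row, means, stds):
    """Apply previously fitted training statistics to a single sample."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]

# Toy training inputs: the second column has a much smaller scale than the first.
X_train = [[0.9, 0.001], [-0.8, 0.003], [0.1, 0.002], [-0.2, 0.002]]
means, stds = fit_standardizer(X_train)
X_train_std = [standardize(row, means, stds) for row in X_train]

# A test sample arriving on its own is scaled with the training statistics,
# never with statistics computed from the test sample itself.
x_new = [0.5, 0.0025]
x_new_std = standardize(x_new, means, stds)
```

The hyperparameters are then learned on `X_train_std`, and every later sample goes through `standardize` with the same `means` and `stds` before prediction.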

Some packages do this job for you. For example, GPR in sklearn has an option normalize that normalizes the inputs but not the outputs (http://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcess.html).
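When a package leaves the outputs untouched, the zero-mean assumption on $y$ has to be handled by hand. One way to write it down, with $\bar{y}$ and $s_y$ denoting the mean and standard deviation of the training outputs (notation introduced here for illustration, not taken from any package), is to fit the model to the standardized outputs and map the predictions back afterwards:

$$\tilde{y}_i = \frac{y_i - \bar{y}}{s_y}, \qquad \mu_* = \bar{y} + s_y\,\tilde{\mu}_*, \qquad \sigma_*^2 = s_y^2\,\tilde{\sigma}_*^2$$

Here $\tilde{\mu}_*$ and $\tilde{\sigma}_*^2$ are the predictive mean and variance of the model fitted to $\tilde{y}$, and $\mu_*$ and $\sigma_*^2$ are the same quantities in the original units of $y$.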
