I am performing Gaussian process regression (GPR) and optimizing over the hyper-parameters, using minFunc for all optimizations. My question is whether we should (or rather, can) standardize the data before passing it to the objective function. If we do standardize, the hyper-parameters will be learned on the standardized data. At test time, however, assuming samples arrive one by one, it won't be possible to standardize each sample independently, right? (Unless we reuse the standardizing factors from the training data.) If it matters, all the elements of my data lie between -1 and 1, but some columns may have a very small mean and variance compared to the other columns.
So my question is, should we normalize the data while doing GPR?
P.S. I actually observe some weird behavior if I don't standardize my data. For example, minFunc suddenly gives me a "step direction is illegal" error. Some online reading led me to believe that either there is a problem in the gradient calculation or the data is not standardized. I am confident in my gradient calculation, and I have also checked it with the DerivativeCheck option. So that leaves the possibility that the data is not standardized.
Best Answer
Yes, it is desirable to standardize the data when fitting a Gaussian process regression. There are a number of reasons:
Some packages do this job for you. For example, GPR in sklearn has a normalize option that normalizes the inputs, but not the outputs (http://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcess.html).
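On the test-time concern from the question: a single incoming sample can be standardized with the means and standard deviations estimated from the training set, as the question itself suggests. Below is a minimal sketch of that idea in plain Python. The ColumnStandardizer class, its method names, and the toy data are hypothetical and only for illustration; they are not part of minFunc or sklearn.

```python
from statistics import fmean, pstdev


class ColumnStandardizer:
    """Stores per-column mean and standard deviation estimated from training data."""

    def fit(self, rows):
        columns = list(zip(*rows))  # transpose rows into columns
        self.means = [fmean(col) for col in columns]
        # Fall back to 1.0 for constant columns to avoid dividing by zero.
        self.stds = [pstdev(col) or 1.0 for col in columns]
        return self

    def transform_one(self, row):
        """Standardize a single sample using the stored training statistics."""
        return [(x - m) / s for x, m, s in zip(row, self.means, self.stds)]


# Hypothetical training inputs: the second column has a much smaller scale.
X_train = [
    [0.9, 0.001],
    [-0.5, 0.003],
    [0.1, 0.002],
    [-0.7, 0.004],
]

scaler = ColumnStandardizer().fit(X_train)
X_train_std = [scaler.transform_one(row) for row in X_train]

# At test time, a sample that arrives on its own reuses the training statistics.
x_new = [0.3, 0.0025]
x_new_std = scaler.transform_one(x_new)
print(x_new_std)
```

The hyper-parameters would then be learned on X_train_std, and every test sample would pass through the same transform_one call before prediction. If the targets are standardized as well, their stored mean and standard deviation are needed to map predictions back to the original scale.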