Solved – Implementation of Gaussian process

Tags: gaussian-process, java

I am trying to implement a simple Gaussian process regression in Java. I took almost every step from the book http://www.GaussianProcess.org/gpml (Rasmussen & Williams, Gaussian Processes for Machine Learning).

With my implementation of Algorithm 2.1 on page 19 I am able to produce results, but:

I see the strange behavior that all predicted data points where I do not have any targets tend toward zero. Does anyone recognize this behavior, or can you explain where I might have made a mistake?
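For reference, Algorithm 2.1 from the book computes, for $n$ training targets $\mathbf{y}$, noise variance $\sigma_n^2$, and a test input $x_*$:

$$L = \operatorname{cholesky}(K + \sigma_n^2 I), \qquad \boldsymbol{\alpha} = L^\top \backslash (L \backslash \mathbf{y})$$

$$\bar{f}_* = \mathbf{k}_*^\top \boldsymbol{\alpha}, \qquad \mathbf{v} = L \backslash \mathbf{k}_*, \qquad \mathbb{V}[f_*] = k(x_*, x_*) - \mathbf{v}^\top \mathbf{v}$$

$$\log p(\mathbf{y} \mid X) = -\tfrac{1}{2}\,\mathbf{y}^\top \boldsymbol{\alpha} - \sum_{i=1}^{n} \log L_{ii} - \tfrac{n}{2}\log 2\pi$$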

In my plot the red line marks the predicted values and the black line marks the targets.

Here is the prediction and covariance Java code. The parameter array holds the length scale (parameter[0]), the signal standard deviation (parameter[1]), and the noise standard deviation (parameter[2]); I use the array indexes as the x inputs for the covariance function:

//squared exponential (RBF) covariance:
//k(x_i,x_j) = sigma_f^2 * exp(-(x_i - x_j)^2 / (2*l^2))
//parameter[0] = length scale l, parameter[1] = signal std dev sigma_f
//Note: the noise sigma_n^2 must NOT also be added here -- predict()
//already adds sigma_n^2*I to K, so adding it here counts the noise twice
//and, worse, leaks noise into the test covariances k_*
private double generateCOV_RADIAL(int i, int j) {

    return Math.pow(parameter[1], 2.0) * Math.exp(-1.0 * (Math.pow(i - j, 2.0)/(2 * Math.pow(parameter[0], 2.0))));
}
public void predict(int predictFuture) throws IllegalArgumentException, IllegalAccessException, InvocationTargetException{

    double mean = StatUtils.mean(this.targets);

    //center the targets BEFORE building y: createColumnRealMatrix copies
    //the array, so building y first would leave y un-centered even though
    //the mean is added back onto the predictions later
    for(int i = 0; i < this.targets.length;i++)
    {
        targets[i] -= mean;
    }

    RealMatrix y = MatrixUtils.createColumnRealMatrix(this.targets);

    RealMatrix K =  MatrixUtils.createRealMatrix(cov);


    //identity matrix for I
    RealMatrix k_eye = MatrixUtils.createRealIdentityMatrix(cov.length);


    //L = Cholesky(K + sigma_n^2 * I)   (Algorithm 2.1)
    CholeskyDecomposition L = null;
    try {
        L = new CholeskyDecompositionImpl(
                K.add(
                        k_eye.scalarMultiply(Math.pow(parameter[2], 2))
                        )
                );
    } catch (NonSquareMatrixException e) {
        e.printStackTrace();
    } catch (NotSymmetricMatrixException e) {
        e.printStackTrace();
    } catch (NotPositiveDefiniteMatrixException e) {
        e.printStackTrace();
    }

    //inverse of L^T for the left division in alpha = L'\(L\y)
    RealMatrix L_transpose_1 = new LUDecompositionImpl(L.getLT()).getSolver().getInverse();
    //inverse of L for the left division (also reused for v = L\k_* below)
    RealMatrix L_1 = new LUDecompositionImpl(L.getL()).getSolver().getInverse();



    //alpha = L'\(L\y)
    RealMatrix alpha = L_transpose_1.multiply(L_1).multiply(y);


    //sum of log(L_ii), i.e. 0.5*log|K + sigma_n^2*I|
    double L_diag = 0.0;

    for(int i = 0; i < L.getL().getColumnDimension();i++)
    {
        L_diag += Math.log(L.getL().getEntry(i, i)); 
    }

    //log marginal likelihood (Algorithm 2.1):
    //log p(y|X) = -0.5*y'*alpha - sum_i(log L_ii) - (n/2)*log(2*pi),
    //where n is the number of training targets (not predictFuture)
    double logpyX = - y.transpose().multiply(alpha).scalarMultiply(0.5).getData()[0][0]
                    - L_diag
                    - targets.length * Math.log(2 * Math.PI) * 0.5;


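    //posterior mean and variance for all indexes 0..targets.length+predictFuture-1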
    double[] fstar = new double[targets.length + predictFuture];
    double[] V = new double[targets.length + predictFuture];

    for(int i = 0;i < targets.length + predictFuture;i++)
    {

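        //k_*: covariance between test input i and every training input j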
        double[] kstar = new double[targets.length];

        for(int j = 0; j < targets.length;j++)
        {
            double covar = (Double)covMethod.invoke(this,j,i);
            kstar[j] = covar;
        }

        //f* = k_*^T * alpha (Algorithm 2.1)
        fstar[i] = MatrixUtils.createColumnRealMatrix(kstar).transpose().multiply(alpha).getData()[0][0];
        //add the mean back, since the GP was fit on centered targets
        fstar[i] += mean;
        //v = L\k_* (Algorithm 2.1)
        RealMatrix v = L_1.multiply(MatrixUtils.createColumnRealMatrix(kstar));

        //V[f*] = k(x_*,x_*) - v^T*v (Algorithm 2.1); adding sigma_n^2 on top
        //gives the variance of a noisy observation y* rather than of f*
        double covar = (Double)covMethod.invoke(this,i,i);

        V[i] = covar - v.transpose().multiply(v).getData()[0][0] + Math.pow(parameter[2],2);
    }

    this.predicted_mean = fstar;
    this.predicted_variance = V;
}
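For completeness, the cov array that feeds K has to be filled from the covariance method before predict() runs. That part of the class is not shown above, so the following helper is an assumption about how it might look (buildTrainingCov is a hypothetical name):

//assumed initialization, not shown in the snippet above: fill the
//n x n training covariance matrix from the kernel before predict()
private void buildTrainingCov() {
    cov = new double[targets.length][targets.length];
    for (int i = 0; i < targets.length; i++) {
        for (int j = 0; j < targets.length; j++) {
            cov[i][j] = generateCOV_RADIAL(i, j);
        }
    }
}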

Thank you

[Figure: GP Example – red: predicted values, black: targets]

Best Answer

I suspect this is because you are using a Gaussian process with a zero mean function, so that unless the covariance function is non-local, the output will go to zero away from the data points. If you are using a local covariance function, such as the squared exponential (RBF), it is a prior over functions that says that the function should be smooth, i.e. the value of the function should be similar to its value at nearby locations. If there are no nearby samples, then the prior says very little about the value of the function, as there is no reference. Thus you get a smooth function (you can get no smoother than a straight line) and you just get the mean.
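You can see this numerically with the squared-exponential form used in the question (a minimal, self-contained sketch with hypothetical hyperparameter values): as the test input moves away from every training input, each entry of k_* decays to zero, so f* = k_*^T * alpha decays to the prior mean no matter what alpha is.

public class SEDecay {
    public static void main(String[] args) {
        double lengthScale = 1.0; // hypothetical value of l
        double signalVar = 1.0;   // hypothetical value of sigma_f^2

        // squared exponential covariance as a function of distance d:
        // k(d) = sigma_f^2 * exp(-d^2 / (2*l^2))
        for (double d : new double[] {0.0, 1.0, 2.0, 5.0, 10.0}) {
            double k = signalVar * Math.exp(-d * d / (2.0 * lengthScale * lengthScale));
            System.out.printf("distance %5.1f -> k = %.8f%n", d, k);
        }
        // by d = 5 the covariance is ~3.7e-6: k_* is effectively zero,
        // so f* = k_*^T * alpha collapses to the (zero) prior mean
    }
}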

If you want to extrapolate from your model, you need a covariance function that tells you what to expect at least over the distance you are extrapolating. A polynomial kernel, or an RBF kernel with a broader length scale, may help.
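For illustration only, here are sketches of both options written in the style of the question's covariance method; the parameter indices match the question's code, but the broadening factor, polynomial degree, and offset are arbitrary assumptions:

//same RBF, but with a deliberately broader length scale so the
//covariance stays non-negligible over the extrapolation range
private double covRadialBroad(int i, int j) {
    double broadLengthScale = 10.0 * parameter[0]; // assumed factor of 10
    return Math.pow(parameter[1], 2.0)
            * Math.exp(-Math.pow(i - j, 2.0) / (2.0 * Math.pow(broadLengthScale, 2.0)));
}

//polynomial kernel k(x, x') = (x*x' + c)^d; it is non-local, so it
//still constrains the function far away from the training inputs
private double covPolynomial(int i, int j) {
    double c = 1.0; // assumed offset
    int d = 2;      // assumed degree
    return Math.pow((double) i * j + c, d);
}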
