Mean squared error of OLS smaller than Ridge

MATLAB, mse, regression, ridge regression

I am comparing the mean squared error (MSE) from a standard OLS regression with the MSE from a ridge regression. I find the OLS MSE to be smaller than the ridge MSE, and I doubt that this is correct. Can anyone help me find the mistake?

To understand the mechanics, I am not using any of MATLAB's built-in functions.

    % Generate data. Note the high correlation between the columns of X.
    X = [ 3    3
          1.1  1
         -2.1 -2
         -2   -2];
    y = [1 1 -1 -1]';

Here I set lambda = 1, but the problem appears for any value of lambda, except when lambda = 0. When lambda = 0, the OLS and the ridge estimates coincide, as they should.

    lambda1 = 1;
    [m,n] = size(X); % Size of X

OLS estimator and MSE:

    b_ols = ((X')*X)^(-1)*((X')*y);
    yhat_ols = X*b_ols;
    MSE_ols = mean((y-yhat_ols).^2)

Ridge estimator and MSE:

    b_ridge = ((X')*X+lambda1*eye(n))^(-1)*((X')*y);
    yhat_ridge = X*b_ridge;
    MSE_ridge = mean((y-yhat_ridge).^2)

For the OLS regression, MSE = 0.0370 and for the ridge regression MSE = 0.1021.

Best Answer

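That is correct: $b_{OLS}$ is, by definition, the minimizer of the MSE on the training data. Because $X^TX$ is invertible here, that minimizer is unique, so any other coefficient vector, including the ridge estimate for any $\lambda > 0$, has a higher MSE on the training dataset. Ridge minimizes a different objective, the residual sum of squares plus the penalty $\lambda \lVert b \rVert^2$, so it gives up some training fit in exchange for smaller coefficients.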
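
A short decomposition makes the uniqueness argument explicit. This is a sketch, assuming $\mathrm{MSE}(b) = \frac{1}{m}\lVert y - Xb \rVert^2$ with $m$ the number of rows of $X$ (as in the question's code), and using the normal equations $X^T(y - Xb_{OLS}) = 0$:

$$
\begin{aligned}
\mathrm{MSE}(b) &= \frac{1}{m}\,\lVert (y - Xb_{OLS}) + X(b_{OLS} - b) \rVert^2 \\
&= \mathrm{MSE}(b_{OLS}) + \frac{1}{m}\,\lVert X(b - b_{OLS}) \rVert^2 + \frac{2}{m}\,(b_{OLS} - b)^T X^T (y - Xb_{OLS}) \\
&= \mathrm{MSE}(b_{OLS}) + \frac{1}{m}\,(b - b_{OLS})^T X^T X \,(b - b_{OLS}).
\end{aligned}
$$

The last term is never negative, and since $X^TX$ is invertible (hence positive definite) here, it is zero only when $b = b_{OLS}$.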

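The effect can also be checked numerically. The following is a minimal sketch using the data from the question; the penalty grid `lambdas` and the variable `mse_train` are names chosen here for illustration, and the backslash solve stands in for the explicit inverse used in the question.

```matlab
% Same data as in the question.
X = [ 3    3
      1.1  1
     -2.1 -2
     -2   -2];
y = [1 1 -1 -1]';
[m, n] = size(X);

% Penalty grid (an arbitrary choice for illustration); lambda = 0 reproduces OLS.
lambdas = [0 0.01 0.1 1 10 100];
mse_train = zeros(size(lambdas));

for k = 1:numel(lambdas)
    % Ridge estimate for this penalty, solved without forming the inverse.
    b = (X'*X + lambdas(k)*eye(n)) \ (X'*y);
    % MSE on the same data that was used for fitting.
    mse_train(k) = mean((y - X*b).^2);
end

% Show each lambda next to its training MSE.
disp([lambdas' mse_train'])
```

The first row (lambda = 0) is the OLS training MSE, and the values in the second column should not decrease as lambda grows. That is the behavior seen in the question, not a sign of a coding mistake.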