I am trying to compare OLR, ridge and lasso in my situation. I could calculate SE for OLR and lasso but not for ridge. The following is Prostrate
data from lasso2
package.
require(lasso2)
require(MASS)
data(Prostate)
fit.lm = lm(lpsa~.,data=Prostate)
summary(fit.lm)$coefficients
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.669399027 1.296381277 0.5163597 6.068984e-01
lcavol 0.587022881 0.087920374 6.6767560 2.110634e-09
lweight 0.454460641 0.170012071 2.6731081 8.956206e-03
age -0.019637208 0.011172743 -1.7575995 8.229321e-02
lbph 0.107054351 0.058449332 1.8315753 7.039819e-02
svi 0.766155885 0.244309492 3.1360054 2.328823e-03
lcp -0.105473570 0.091013484 -1.1588785 2.496408e-01
gleason 0.045135964 0.157464467 0.2866422 7.750601e-01
pgg45 0.004525324 0.004421185 1.0235545 3.088513e-01
fit.rd = lm.ridge(lpsa~.,data=Prostate, lamda = 6.0012)
#summary(fit.rd)$coefficients, doesnot provide SE
lfit = l1ce(lpsa~.,data=Prostate,bound=(1:500)/500)
summary(lfit[[10]])$coefficients
Value Std. Error Z score Pr(>|Z|)
(Intercept) 2.43614448 2.130515543 1.1434530 0.2528505
lcavol 0.03129045 0.125288320 0.2497475 0.8027826
lweight 0.00000000 0.274549270 0.0000000 1.0000000
age 0.00000000 0.018287840 0.0000000 1.0000000
lbph 0.00000000 0.095587974 0.0000000 1.0000000
svi 0.00000000 0.390936045 0.0000000 1.0000000
lcp 0.00000000 0.149824868 0.0000000 1.0000000
gleason 0.00000000 0.260274039 0.0000000 1.0000000
pgg45 0.00000000 0.007285054 0.0000000 1.0000000
I have a couple of questions:
(1) How can we calculate Std. Error in case of ridge regression ?
(2) Is it valid to compare Std. Error for deciding which (ridge
, lasso
or OLS
) method to use ? Or there are other methods ? If so how can I get them ?
Best Answer
Since the goal is prediction you can choose some distance measure and then calculate the distance between your predictions $\hat{Y}$ and the true values $Y$, choosing the method with the minimum distance. The most common distance measurement for this setting is the mean squared error: MSE = $\sum_{i=1}^n \left(\hat{Y_i} - Y_i\right)^2$. The hardest part is appropriately choosing your training and testing sets and then performing model selection procedure required by the Lasso and Ridge regularization.
First, split your data into two parts: training $X_{training}$ and testing $X_{testing}$. A common choice is 66% training, 34% testing but the choice of proportion is influenced by the size of your data set. Now forget about the testing set for a while. Train, or fit, the three models using just the training data. Model selection for the regularization parameter $\lambda$ should be chosen via cross-validation since prediction is your goal here. Finally, using the best $\lambda$, perform prediction on the testing data to obtain values for $\hat{Y}$. I haven't used the
Lasso2
orMASS
implementations of regularized regression but theglmnet
package makes the process very straightforward because it provides a cross-validatiion function. Here's some R code to do this on theProstate
data:Note that the
sample()
function randomly chooses the rows for training and testing so the MSE will change every time you run it. And since theProstate
data is really just meant as a demonstration dataset, no clear winner is likely to emerge.