You can save yourself some coding effort, surprisingly enough, by simply doing the leave-one-out (LOO) cross-validation yourself:
# normalize the weights once, up front (see the numerical issue below)
data1$norm.wt <- data1$wt / sum(data1$wt)
# leave-one-out loop: refit without row j, then predict the probability for row j
loo <- rep(0, nrow(data1))
for (j in 1:nrow(data1))
{
  fj <- glm(y ~ x1 + x2, family = quasibinomial, weights = norm.wt, data = data1[-j, ])
  loo[j] <- predict(fj, data1[j, ], type = "response")
}
> print(loo, digits=4)
[1] 2.564e-02 2.220e-16 4.405e-01 2.128e-07 7.360e-01 5.360e-01 2.383e-02
[8] 2.064e-02 1.316e-01 2.174e-02 4.152e-01 1.895e-01 7.162e-01
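You can then summarize these held-out probabilities however you like; for instance (a minimal sketch, assuming data1$y is coded 0/1):
# leave-one-out Brier score and misclassification rate
mean((data1$y - loo)^2)
mean((loo > 0.5) != data1$y)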
Normalizing the weights to sum to one prevents a numerical problem (in this case) that results in your parameter estimates blowing up:
> summary(glm(y~x1+x2, family=quasibinomial, weights=wt, data=data1))

Call:
glm(formula = y ~ x1 + x2, family = quasibinomial, data = data1, 
    weights = wt)

Deviance Residuals: 
   Min     1Q Median     3Q    Max  
-394.9    0.0    0.0    0.0  164.4  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) -6.790e+15  2.354e+15  -2.885   0.0162 *
x1           2.004e+15  1.630e+15   1.230   0.2470  
x2           3.403e+15  1.393e+15   2.443   0.0347 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for quasibinomial family taken to be 2.030037e+18)

    Null deviance:  25460  on 12  degrees of freedom
Residual deviance: 324940  on 10  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 12
versus normalized weights:
> summary(glm(y~x1+x2, family=quasibinomial, weights=data1$wt/sum(data1$wt), data=data1))

Call:
glm(formula = y ~ x1 + x2, family = quasibinomial, data = data1, 
    weights = data1$wt/sum(data1$wt))

Deviance Residuals: 
     Min       1Q   Median       3Q      Max  
 -0.4273  -0.1004  -0.0444   0.1706   0.3879  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -8.038      6.978  -1.152    0.276
x1             1.631      3.177   0.513    0.619
x2             4.142      3.536   1.171    0.269

(Dispersion parameter for quasibinomial family taken to be 0.1440099)

    Null deviance: 1.30513  on 12  degrees of freedom
Residual deviance: 0.84517  on 10  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 6
You don't have to renormalize the weights at every step of the LOO loop; in effect they are renormalized anyway, since the weights are relative (see the sketch below).
This doesn't match the SAS probabilities, admittedly, but it seems to me it's what you're trying to do.
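To illustrate the point about relative weights (a sketch, using the data1 and norm.wt from above): multiplying all the weights by a positive constant leaves the fitted coefficients, and hence the predicted probabilities, unchanged; only the dispersion estimate changes.
# prior weights in a GLM are only defined up to a constant factor,
# so rescaling them does not change the coefficient estimates
f1 <- glm(y ~ x1 + x2, family = quasibinomial, weights = norm.wt,     data = data1[-1, ])
f2 <- glm(y ~ x1 + x2, family = quasibinomial, weights = norm.wt / 2, data = data1[-1, ])
all.equal(coef(f1), coef(f2))   # TRUE, up to numerical tolerance
all.equal(predict(f1, data1[1, ], type = "response"),
          predict(f2, data1[1, ], type = "response"))   # TRUE as well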
In Friedman, Hastie, and Tibshirani (2010), the deviance of a binomial model, for the purpose of cross-validation, is calculated as minus twice the log-likelihood on the left-out data (p. 17). Given that this is the paper cited in the documentation for glmnet (on pp. 2 and 5), that is probably the formula used in the package.
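As a concrete (hypothetical) illustration, applying that formula to the leave-one-out probabilities loo computed in the first answer, and again assuming data1$y is coded 0/1:
# cross-validated deviance: minus twice the log-likelihood on the left-out data
cv.dev <- -2 * sum(data1$y * log(loo) + (1 - data1$y) * log(1 - loo))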
And indeed, in the source code for the function cv.lognet, the deviance residuals for the response are calculated as
-2*((y==2)*log(predmat)+(y==1)*log(1-predmat))
where predmat is simply
predict(glmnet.object,x,lambda=lambda)
and is passed in from the enclosing cv.glmnet function. I used the source code available on the JStatSoft page for the paper, and I don't know how up-to-date that code is. The code for this package is surprisingly simple and readable; you can always check for yourself by typing glmnet:::cv.glmnet.
I was looking for the answer to the same question, and I took a look at the code (v1.8-2) at http://cran.r-project.org/web/packages/glmnet/index.html.
The file "cv.elnet.R" (in your case cv.lognet.R) contains the relevant pieces of code.
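The relevant lines look roughly like this (a paraphrase of the v1.8-2 sources, so variable names and exact layout may differ between releases):
# per-observation cross-validated loss (squared error in the elastic-net case)
cvraw <- (y - predmat)^2
# fold results are combined with a *weighted* mean, using the observation weights
cvm  <- apply(cvraw, 2, weighted.mean, w = weights, na.rm = TRUE)
cvsd <- sqrt(apply(scale(cvraw, cvm, FALSE)^2, 2,
                   weighted.mean, w = weights, na.rm = TRUE) / (N - 1))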
So the package does use the weights in the computation of the figures of merit, even on the held-out data. (You will find similar lines in cv.lognet.R as well.)