I am evaluating a dataset of ~400 subjects and 10 covariates trying to fit a Cox ph model for predicting survival in AML patients.
To evaluate the models I am using a bootstrapping procedure of 50 iterations (sampling with replacement). Models are built on the training set and tested on the test set. For each model I calculate Harrell's C-Index (using rcorr.cens from the Hmisc package) and the Brier score (using the peperr package). I then average the results over the 50 iterations. I get an average C Index of 0.70 and a Brier score of 0.08.
My problem:
When I check the p values of the likelihood ratio test / Wald test of each individual model they are non-significant. Of note, this is a subset analysis. When I build models using other covariates in the dataset I get similar CIndex/Brier scores as well as highly significant likelihood ratio test/Wald tests.
Questions:
-
What could be the reason for the discrepancy between the evaluation on the testset (CIndex and Brier) and the evaluation on the training set (Likelihood ratio/Wald)
-
Some covariates are found to be significant in the Cox models (though the model itself is not). Is there any meaning for these covariate considering I can merge these results across 50 iterations ?
p.s. – I am aware that I am using the LASSO regularization (glmnet) for feature selection which is prone to overstating my covariates. This is a limitation of my analysis which I am attempting to minimize by the bootstrapping procedure. Regardless, I would like to figure out what is the reason for the discrepancy in my results.
I am attaching my code. Unfortunately the data are PHI and cannot be shared.
library("survival")
library ("glmnet")
library ("c060")
library ("peperr")
library ("Hmisc")
##-----------------------------------------------------------------------------------
## Accessory Functions
##-----------------------------------------------------------------------------------
##
## This function takes a predictor matrix and a Surv obj and uses the glmnet lasso
## regularization method to select the most predictive features and create a coxph
## object predict_matrix is the 'dataset' reformated to replace categorical variables
## to binary with dummy
dataset2predictmatrix <- function (dataset,nonzero.coef=NULL) {
#creat Y (survival matrix) for glmnet
surv_obj <- Surv(dataset$time,dataset$status)
## tranform categorical variables into binary variables with dummy for dataset
predict_matrix <- model.matrix(~ ., data=dataset,
contrasts.arg=lapply(dataset[,sapply(dataset, is.factor)],
contrasts))
## remove the statu/time variables from the predictor matrix (x) for glmnet
predict_matrix <- subset (predict_matrix, select=c(-time,-status))
if (!is.null(nonzero.coef)){
predict_matrix<-predict_matrix[,nonzero.coef]
}
list (predict_matrix=predict_matrix,surv_obj=surv_obj)
}
glmnetfun<-function (predict_matrix,surv_obj){
## create a glmnet cox object using lasso regularization and cross validation
glmnet.cv <- cv.glmnet (predict_matrix, surv_obj, family="cox")
## get the glmnet model
glmnet.obj <- glmnet.cv$glmnet.fit
# find lambda index for the models with least partial likelihood deviance (by cv.glmnet)
optimal.lambda <- glmnet.cv$lambda.min # For a more parsimoneous model use lambda.1se
lambda.index <- which(glmnet.obj$lambda==optimal.lambda)
# take beta for optimal lambda
optimal.beta <- glmnet.obj$beta[,lambda.index]
# find non zero beta coef
nonzero.coef <- abs(optimal.beta)>0
selectedBeta <- optimal.beta[nonzero.coef]
# take only covariates for which beta is not zero
selectedVar <- predict_matrix[,nonzero.coef]
# create a dataframe for trainSet with time, status and selected variables in binary representation for evaluation in pec
reformat_dataSet <- as.data.frame(cbind(surv_obj,selectedVar))
# create coxph object with pre-defined coefficients
glmnet.cox <- coxph(Surv(time,status) ~ .,data=reformat_dataSet,init=selectedBeta,iter=200)
# # glmnet.cox only with meaningful features selected by stepwise bidirectional AIC feature selection
# glmnet.cox.meaningful <- step(coxph(Surv(time,status) ~ .,data=reformat_dataSet),direction="both")
#
# selectedVarCox<- selectedVar [,attr(glmnet.cox.meaningful$terms,"term.labels")]
#
# Returned object -> list with glmnet.cv glmnet.obj selectecVar reformat_dataSet
list(glmnet.cv=glmnet.cv, glmnet.obj=glmnet.obj, glmnet.cox=glmnet.cox,
nonzero.coef=nonzero.coef, selectedVar=selectedVar,
reformat_dataSet=reformat_dataSet
# , glmnet.cox.meaningful=glmnet.cox.meaningful, selectedVarCox=selectedVarCox
)
}
## This function takes as input a surv_obj (testdate Surv object) and a testdata prediction matrix. Calculates a Brier score
## for either a cox model or a glmnet. Returns an ipec object which is the Brier score (requires peperr and c060)
brier_calc <- function (surv_obj, predict_matrix, model=c("cox","glmnet")){
if (model=="cox"){
peperr <- peperr.cox <- peperr(response=surv_obj, x=predict_matrix,
fit.fun=fit.coxph, load.all=TRUE,
indices=resample.indices(n=nrow(surv_obj), method="boot", sample.n=50))
} else if (model=="glmnet"){
peperr <- peperr(response=surv_obj, x=predict_matrix,
fit.fun=fit.glmnet, args.fit=list(family="cox"),
complexity=complexity.glmnet,args.complexity=list(family="cox"),load.all=TRUE,
indices=resample.indices(n=nrow(surv_obj), method="boot", sample.n=50))
}
# Get error predictions from peperr
prederr <- perr(peperr)
# Integrated prediction error Brier score calculation
ipec<-ipec(prederr, eval.times=peperr$attribute, response=surv_obj)
list (peperr=peperr,ipec=ipec)
}
trainset<-dataset2predictmatrix (dataset)
models<-glmnetfun (trainset$predict_matrix,trainset$surv_obj)
# rfsrc<-rfsrcfunc (dataset)
briercox<-brier_calc (trainset$surv_obj,models$selectedVar,model="cox")
# requires c060 package which doesn't work on Linux
brierglmnet<-brier_calc (trainset$surv_obj,trainset$predict_matrix,model="glmnet")
# briermeaningful <- brier_calc (trainset$surv_obj,models$selectedVarCox,model="cox")
# brierRFSRC <-pec_calc(rfsrc$obj,dataset)
# brierRfit<-pec_calc(rfsrc$rfit ,rfsrc$selectedDataset)
# calculate brier score for coxph with feature selection based on rfsrc
# selectedDataset<-dataset2predictmatrix (rfsrc$selectedDataset)
# brierCox.rfsrc<-brier_calc(selectedDataset$surv_obj,selectedDataset$predict_matrix ,model="cox")
# save image of the workspace after each iteration
save.image(file)
## C-Index calculation 50 iter bootstrapping
## initialization of lists for various outputs
cIndexCox<- list ()
cIndexglmnet<- list ()
# cIndexCoxAIC<- list ()
# cIndexRFSRC<- list()
# cIndexRfit<- list ()
# cIndexCox.rfsrc<- list ()
i<-1
# pb <- txtProgressBar(min = 1, max = 50, style = 3) ## initialize a progress bar
train_results_list<-list()
# rfsrc_train_list<- list ()
while (i < 51){
print (paste("Iteration:",i))
# setTxtProgressBar(pb, i)## run progress bar
train <- sample(1:nrow(dataset), nrow(dataset), replace = TRUE) ## random sampling with replacement
# create a dataframe for trainSet with time, status and selected variables in binary representation for evaluation in pec
trainset <- dataset [train,]
testset<- dataset [-train,]
train_matrix <- dataset2predictmatrix (trainset)
train_results<-glmnetfun(train_matrix$predict_matrix,train_matrix$surv_obj) #runs the sequence of glmnet.cv->glmnet.obj->glmnet.cox
train_results_list<-c(train_results_list,list(train_results)) #create a list of results of all runs
# compute c-index (Harrell's) for cox-glmnet models
if (!is.null(train_results$glmnet.cox)){
# creates a matrix of only the variables identified by the glmnet
test_matrix <- dataset2predictmatrix(testset, nonzero.coef=train_results$nonzero.coef)
reformat_testSet <- as.data.frame(cbind(test_matrix$surv_obj,test_matrix$predict_matrix)) #casts matrix to dataframe
cIndexCox <- c(cIndexCox,
1-rcorr.cens(predict(train_results$glmnet.cox, reformat_testSet),test_matrix$surv_obj)[1])
i=i+1
# compute c-index (Harrell's) for glmnet models
test_matrix<- dataset2predictmatrix (testset) # create matrix of the entire testset for glmnet evaluation
cIndexglmnet <- c(cIndexglmnet,
1-rcorr.cens(predict(train_results$glmnet.cv, test_matrix$predict_matrix),test_matrix$surv_obj)[1])
save.image(file)
}
}
meanCICox<- c(mean (unlist(cIndexCox),na.rm=TRUE),std.error(unlist(cIndexCox)))
meanCIglmnet<- c(mean (unlist(cIndexglmnet),na.rm=TRUE),std.error(unlist(cIndexglmnet)))
#
Best Answer
When using the Brier score (for which you have to use a special version to account for censoring - I assume you did that) or the $c$-index, the corresponding log-likelihood-based statistic is the likelihood ratio $\chi^2$ statistic for the full model. Do not look at partial $\chi^2$. Note that except for the Brier score, the R
rms
packagevalidate
function does the calculations and bootstrapping you need, automatically.