Solved – Custom resampling method in caret

caret, r, resampling

I need to create a custom resampling method in the R package caret. For each iteration of leave-pair-out cross-validation, I derive new data from the training set using a function I implemented. The derived data is then used as the training set, and the model is tested against the two instances left out. In the next iteration, new data is derived from the newly resampled training set.

The caret custom models manual page shows how to make a custom model, but I would like to make a custom resampling method in the trainControl function that can be used with any train model. Is that possible in caret?

So to recap, I need some sort of LGOCV that, for each iteration, takes the input training set and transforms it into a new data frame, using a function I created, which is then used as the actual training set.

Best Answer

So, if you had 8 training set samples would this scheme result in choose(8,2) = 28 resamples? Also, I'm assuming that this isn't two nested leave-one-out loops.

If so, here is a solution that might break down with large sample sizes, since the number of resamples grows as choose(n, 2).

library(caret)

num_samps <- 8
holdout <- combn(num_samps, 2)
in_training <- apply(holdout, 2, 
                     function(x, all) all[!(all %in% x)],
                     all = 1:num_samps)

## convert the columns of the matrix to a list,
## one element of training row indices per resample:
index <- lapply(seq_len(ncol(in_training)),
                function(i) in_training[, i])
## cosmetic:
names(index) <- caret:::prettySeq(seq(along = index))

ctrl <- trainControl(method = "cv", ## this will be ignored since  
                     ## we supply index below
                     index = index)  
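
As a quick check on the scheme above, here is a minimal sketch that builds the same training indices and also writes out the held-out pairs explicitly. The names index_out and "Pair1", "Pair2", ... are illustrative choices, not part of the original answer. trainControl has an indexOut argument for the held-out rows; when it is left out, caret uses the rows not listed in index, so passing it here only makes the pairing explicit.

```r
num_samps <- 8
holdout <- combn(num_samps, 2)   # each column is one left-out pair

## training rows for each resample: everything except the left-out pair
index <- lapply(seq_len(ncol(holdout)),
                function(i) setdiff(seq_len(num_samps), holdout[, i]))

## held-out rows for each resample, stated explicitly
index_out <- lapply(seq_len(ncol(holdout)),
                    function(i) holdout[, i])

## illustrative labels (any unique names would do)
names(index) <- names(index_out) <- paste0("Pair", seq_along(index))

## sanity checks: one resample per pair, 6 training rows, 2 test rows
stopifnot(length(index) == choose(num_samps, 2))
stopifnot(all(lengths(index) == num_samps - 2))
stopifnot(all(lengths(index_out) == 2))

## only attempted when caret is installed
if (requireNamespace("caret", quietly = TRUE)) {
  ctrl <- caret::trainControl(method = "cv",
                              index = index,
                              indexOut = index_out)
}
```

With only two held-out rows per resample, metrics based on correlation (such as R-squared) are not very informative, so RMSE or a similar error measure is probably the safer summary.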

Max
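
The index list above controls which rows are used for fitting, but it does not apply the per-resample transformation the question asks for. One way to see the whole procedure end to end is a plain loop outside of train. The sketch below is only an illustration under stated assumptions: derive_training is a hypothetical stand-in for the asker's own function (here it appends a jittered copy of the training rows), the data are the first 8 rows of mtcars, and lm is used as a placeholder model.

```r
## 8 rows, to match the size used in the answer
dat <- mtcars[1:8, c("mpg", "wt")]
holdout <- combn(nrow(dat), 2)   # each column is one left-out pair

## hypothetical placeholder for the asker's transformation:
## returns the training rows plus a slightly jittered copy
derive_training <- function(train_df) {
  jittered <- train_df
  jittered$wt <- jittered$wt + rnorm(nrow(jittered), sd = 0.01)
  rbind(train_df, jittered)
}

set.seed(1)
rmse <- apply(holdout, 2, function(pair) {
  ## transform only the training rows of this resample
  train_df <- derive_training(dat[-pair, ])
  ## the two held-out rows are left untouched
  test_df <- dat[pair, ]
  fit <- lm(mpg ~ wt, data = train_df)
  pred <- predict(fit, newdata = test_df)
  sqrt(mean((test_df$mpg - pred)^2))
})

length(rmse)   # one RMSE per left-out pair
mean(rmse)     # average held-out error across all pairs
```

The point of the design is that the transformation is recomputed from the training rows inside every iteration, so nothing derived from the held-out pair leaks into the fit. This loop does not give the tuning-grid machinery of train; it only shows the resampling logic.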
