Solved – Repeated 100x 10-fold cross validation, what is the sample size when doing a significance test

cross-validation, sample-size, statistical significance, t-test

I repeated my 10-fold cross validation 100 times for several methods. Now I want to use a t-test to check whether the results are significant. However, I'm not sure what the sample size is.
Is the sample size the original number of samples, or is it the original number of samples x 100?

edit:

For university we need to classify 3 cancer types and give an estimate of how well our model will perform. We received a dataset with 100 samples. We split the data into a training and a test set using stratified sampling with a ratio of 0.7 to 0.3. The resulting training set consists of 69 samples, and the test set of 31 samples.

We used repeated cross validation because of this paper:
http://www.cse.iitb.ac.in/~tarung/smt/papers_ppt/ency-cross-validation.pdf

The repeated cross-validation is done on the same training set, but the folds are randomly chosen in every repetition, so they should be different every time.
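To make the procedure concrete, here is a minimal sketch of how the splits of a 100x repeated 10-fold cross-validation on 69 training samples could be generated. The function name `repeated_kfold_indices` and the seed are hypothetical, and for brevity the folds are not stratified by class, which is an assumption that may differ from the actual setup.

```python
import random

def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: folds are NOT stratified by class here.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random fold assignment in every repetition
        for fold in range(n_folds):
            test_idx = indices[fold::n_folds]  # every n_folds-th shuffled index
            test_set = set(test_idx)
            train_idx = [i for i in indices if i not in test_set]
            yield repeat, fold, train_idx, test_idx

splits = list(repeated_kfold_indices(n_samples=69))
n_scores = len(splits)  # one accuracy value per (repeat, fold) pair
distinct_samples = {i for _, _, _, test_idx in splits for i in test_idx}
print(n_scores, len(distinct_samples))  # 1000 69
```

The sketch yields 1000 train/test splits, so 1000 fold-level accuracies per method, but all of them are computed from the same 69 samples, and the training sets of different folds and repetitions overlap heavily.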

What we want to test is whether the accuracy of one model is significantly better than the accuracy of a different model.

Best Answer

I think you can make such a comparison. Since the different models use the same dataset, their accuracies can be compared. However, one important question is whether you need to take the parameters of the models into account, and whether these parameters will influence your conclusion.
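As an illustration of such a comparison, the sketch below computes a paired t statistic on per-fold accuracy differences between two models evaluated on identical splits. It is not taken from the original post. Because the fold-level scores from repeated cross-validation are not independent, it also computes the corrected resampled t statistic proposed by Nadeau and Bengio, which inflates the variance by the ratio of test to training size. The function name, the accuracy values, and the fold sizes (about 62 training and 7 test samples per fold) are assumptions made only for this example.

```python
import math
import statistics

def paired_t_statistics(scores_a, scores_b, n_train, n_test):
    """Return (naive_t, corrected_t, degrees_of_freedom).

    scores_a and scores_b hold one accuracy per (repeat, fold) pair,
    measured on identical splits for both models. Hypothetical helper.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)                        # number of paired fold-level scores
    mean_d = statistics.mean(diffs)
    var_d = statistics.variance(diffs)    # sample variance (n - 1 denominator)
    if var_d == 0:
        raise ValueError("all differences are identical; t is undefined")
    # Naive version: treats the n differences as independent observations.
    naive_t = mean_d / math.sqrt(var_d / n)
    # Corrected version: adds n_test / n_train to account for overlapping
    # training sets between folds and repetitions.
    corrected_t = mean_d / math.sqrt((1.0 / n + n_test / n_train) * var_d)
    return naive_t, corrected_t, n - 1

# Made-up accuracies for illustration only (6 paired fold results).
acc_a = [0.86, 0.71, 0.86, 0.71, 1.00, 0.86]
acc_b = [0.71, 0.71, 0.86, 0.57, 0.86, 0.86]
naive_t, corrected_t, dof = paired_t_statistics(acc_a, acc_b, n_train=62, n_test=7)
print(naive_t, corrected_t, dof)
```

With 100 repetitions of 10 folds the lists would contain 1000 paired scores. The corrected statistic is always smaller in magnitude than the naive one, which reflects that 1000 dependent fold results carry less information than 1000 independent samples would.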