The larger my test set is, the smaller my training set becomes, so I discard potentially useful information. Can this be solved via a "stacked" n-fold CV?
Yes. It is usually called nested or double cross-validation, and we have a number of questions and answers about that. You could start, e.g., with Nested cross validation for model selection.
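For concreteness, here is a minimal sketch of what that looks like with scikit-learn; the toy dataset, the SVM, and the grid are all placeholders for illustration, not taken from the question:

```python
# Minimal sketch of nested (double) cross-validation.
# Estimator, grid, and dataset are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=20, random_state=0)

# Inner loop: tunes the hyper-parameters.
inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
tuned_svm = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.1]},
    cv=inner_cv,
)

# Outer loop: estimates the performance of the whole "tune + fit" procedure.
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(tuned_svm, X, y, cv=outer_cv)
print(scores.mean(), scores.std())
```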
Do I really have to do a REPEATED n-fold CV? Are there other possibilities?
Repetitions/iterations in resampling validation help only if the (surrogate) models are unstable. If you are really sure your models are stable (but how can you be when you have concerns about small sample size?), then you don't need the iterations/repetitions. OTOH, IMHO the easiest way to show that the models are stable is to run a few iterations and look at the stability of the predictions.
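A rough way to do that stability check, assuming a scikit-learn workflow and an illustrative model/dataset (not the poster's actual setup), is to repeat the CV a few times and look at how much the scores move between repetitions:

```python
# Sketch of a stability check via repeated cross-validation.
# Model and data are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=20, random_state=0)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(SVC(C=1.0, gamma="scale"), X, y, cv=cv)

# One mean score per repetition: if these barely vary between repetitions,
# the surrogate models are stable and further repetitions add little.
per_repeat = scores.reshape(10, 5).mean(axis=1)
print("mean accuracy per repetition:", per_repeat)
print("spread across repetitions:", per_repeat.std())
```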
Is the error rate an appropriate loss function, or should I choose another one (e.g. the empirical error function or MSE, but then I'd need a probability output, right)?
No, overall error rate is not a very good loss function, particularly not for optimization. MSE is much better; it is a proper scoring rule. Yes, proper scoring rules need probability output.
However, SVMs are in any case quite awkward to optimize, as they do not react continuously to small changes in the training data or hyperparameters: up to a certain point nothing changes (i.e. the same cases stay support vectors), and then the set of support vectors changes abruptly.
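As an illustration of using a proper scoring rule, here is a small sketch that computes the Brier score (MSE on predicted probabilities) next to the error rate; the SVM, data, and split are placeholder assumptions, and probability=True simply asks scikit-learn's SVC for (Platt-scaled) probability estimates:

```python
# Sketch comparing error rate with a proper scoring rule (Brier score).
# Model, data, and split are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# probability=True provides the probability output needed for proper scoring rules.
model = SVC(C=1.0, gamma="scale", probability=True).fit(X_tr, y_tr)
p = model.predict_proba(X_te)[:, 1]

print("error rate :", 1 - accuracy_score(y_te, model.predict(X_te)))
print("Brier score:", brier_score_loss(y_te, p))
```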
Nested cross-validation and repeated k-fold cross-validation have different aims. The aim of nested cross-validation is to eliminate the bias in the performance estimate due to the use of cross-validation to tune the hyper-parameters. As the "inner" cross-validation has been directly optimised to tune the hyper-parameters, it will give an optimistically biased estimate of generalisation performance. The aim of repeated k-fold cross-validation, on the other hand, is to reduce the variance of the performance estimate (to average out the random variation caused by partitioning the data into folds).

If you want to reduce both bias and variance, there is no reason (other than computational expense) not to combine the two, such that repeated k-fold is used for the "outer" cross-validation of a nested cross-validation estimate. Using repeated k-fold cross-validation for the "inner" folds might also improve the hyper-parameter tuning.
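As a sketch of that combination (estimator, grid, and data are illustrative assumptions), repeated k-fold can simply be plugged in as the outer, and if desired also the inner, splitter of a nested cross-validation:

```python
# Sketch of nested CV with repeated k-fold for both the outer performance
# estimate and the inner tuning; all settings are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import (GridSearchCV, RepeatedStratifiedKFold,
                                     cross_val_score)
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=20, random_state=0)

inner_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)   # tuning
outer_cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=1)  # evaluation

tuned = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1.0]},
                     cv=inner_cv)
scores = cross_val_score(tuned, X, y, cv=outer_cv)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```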
If all of the models have only a small number of hyper-parameters (and they are not overly sensitive to the hyper-parameter values) then you can often get away with a non-nested cross-validation to choose the model, and only need nested cross-validation if you need an unbiased performance estimate, see:
Jacques Wainer and Gavin Cawley, "Nested cross-validation when selecting classifiers is overzealous for most practical applications", Expert Systems with Applications, Volume 182, 2021 (doi, pdf)
If, on the other hand, some models have more hyper-parameters than others, the model choice will be biased towards the models with the most hyper-parameters (which is probably a bad thing, as they are the ones most likely to experience over-fitting in model selection). See the comparison of RBF kernels with a single hyper-parameter and Automatic Relevance Determination (ARD) kernels, with one hyper-parameter for each attribute, in section 4.3 of my paper (with Mrs Marsupial):
GC Cawley and NLC Talbot, "On over-fitting in model selection and subsequent selection bias in performance evaluation", The Journal of Machine Learning Research 11, 2079-2107, 2010 (pdf)
The PRESS statistic (which is the inner cross-validation) will almost always select the ARD kernel, despite the RBF kernel giving better generalisation performance in the majority of cases (ten of the thirteen benchmark datasets).
The first approach is actually hold-out evaluation (although CV is used for tuning), and the second approach is cross-validation if you consider the hyperparameters (e.g. the feature importances, the number of features, K, etc.) to be parameters of some modeling process that you intend to evaluate using cross-validation. This is explained well in How to get hyper parameters in nested cross validation?.
If conceptualized this way, the answers in Hold-out validation vs. cross-validation become directly relevant. Some major benefits:
If you use hold-out, you "lose" the testing data (in contrast, CV allows you to make statements about the generalization error of the model trained on the full dataset, so you don't waste any data). Sample size is a major consideration here, and with 150 observations I think the recommendation would be to use CV.
CV with its multiple folds gives a sense of the variability of the feature selection/hyperparameter optimization process, as well as some measure of variability of performance. Clearly, a modeling process with accuracy $0.90 \pm 0.20$ is not the same as one with $0.90 \pm 0.02$.
Another method that gives similar benefits is the bootstrap: see Cross-validation or bootstrapping to evaluate classification performance?. That page also discusses why accuracy is a poor scoring rule, even without class imbalance.
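For completeness, a rough sketch of that bootstrap variant (train on a bootstrap resample, evaluate on the out-of-bootstrap cases); the model and data are again illustrative placeholders:

```python
# Sketch of a bootstrap (out-of-bootstrap) estimate of classification performance.
# Model and data are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=20, random_state=0)
rng = np.random.default_rng(0)

scores = []
for _ in range(200):
    idx = rng.integers(0, len(y), size=len(y))      # bootstrap resample (with replacement)
    oob = np.setdiff1d(np.arange(len(y)), idx)      # cases not drawn: evaluation set
    model = SVC().fit(X[idx], y[idx])
    scores.append(model.score(X[oob], y[oob]))

print(np.mean(scores), np.std(scores))
```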
One difficulty with CV for modeling process evaluation (e.g. nested CV) is that it requires you to automate your entire modeling process, so anything that is subjective or manual is pretty much out of the question. Sometimes domain expertise can only be integrated manually. Further, the hyperparameter search must be automated, which is fairly easy, but so must the search for the hyperparameter search space.

For example, if you find in some fold F that your chosen K (for kNN) is at the border of your search space, you might want to expand the search space. If you don't do this, your comparison between kNN and SVM will not be valid, because it's possible that you gave SVM a better search space than you gave kNN. This search space expansion can only be done within fold F; there will be leakage if you have a globally defined search space used for all the folds that you change after seeing this (see Does changing the parameter search space after nested CV introduce optimistic bias?). All of this might take much longer to run (and be considerably more difficult to program) than a simple hold-out.
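A minimal sketch of what "automating the whole modeling process" can look like, assuming a scikit-learn pipeline with univariate feature selection and kNN (the specific steps and grids are illustrative, not the poster's actual pipeline); the entire pipeline, including feature selection and its fixed search space, is refit inside every outer fold:

```python
# Sketch of an automated modeling process (scaling + feature selection + kNN)
# evaluated with nested CV; steps and grids are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=150, n_features=20, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif)),   # feature selection is redone in every fold
    ("knn", KNeighborsClassifier()),
])

# The search space is fixed up front; widening it after inspecting one fold's
# result would leak information across folds.
grid = {"select__k": [5, 10, 20], "knn__n_neighbors": [1, 3, 5, 7, 9, 11]}

inner_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(GridSearchCV(pipe, grid, cv=inner_cv), X, y, cv=outer_cv)
print(scores.mean(), scores.std())
```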