Solved – What are acceptable validation or cross validation error rates

cross-validation, error, machine learning, validation

Is there a commonly acceptable error rate for validation? As in, if the error rate is less than X %, then my machine learning method would be considered "successful".

I'm looking for something analogous to a p-value of 0.05 used for many experiments, but for cross-validation.

I would use 5% as an error rate, but that's really hard to achieve, especially if you have small training and validation sets (I only have 6 subjects in total).

Best Answer

It's a bit hard to get an error rate below 5% when each sample represents 16.7% of the data: with 6 subjects, the smallest nonzero error rate is already 16.7%, so only a perfect score would pass. But even with a very large sample size, there is no "universal" acceptable error rate that would suit all applications. The expected MSE of an estimator can be decomposed as bias^2 + variance + noise, so even a "perfect" learning machine cannot get rid of the noise term, which is application dependent. Intuitively, noise comes from the fact that the underlying data-generating process is non-deterministic, i.e., $y = f(x) + \epsilon$ with $E[\epsilon \mid x] = 0$: you may get different values of $y$ (the target) for two samples with the exact same $x$ (vector of inputs). The best predictor (in MSE terms) is $\hat{y} = f(x)$, with error equal to $Var(\epsilon)$.
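To make the noise floor concrete, here is a minimal simulation sketch. The choice of $f(x) = \sin(x)$, the Gaussian noise, the input range, and the function name `oracle_mse` are all assumptions made for illustration; nothing here comes from the original question's data. It scores the "perfect" predictor $\hat{y} = f(x)$ and shows that its MSE stays close to $Var(\epsilon)$ rather than going to zero.

```python
import numpy as np


def oracle_mse(noise_std, n_samples=200_000, seed=0):
    """Empirical MSE of the ideal predictor y_hat = f(x) on noisy data.

    Assumptions for illustration only: f(x) = sin(x), and zero-mean
    Gaussian noise with standard deviation `noise_std`.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-3.0, 3.0, size=n_samples)
    f_x = np.sin(x)                                   # true function f(x)
    eps = rng.normal(0.0, noise_std, size=n_samples)  # irreducible noise
    y = f_x + eps                                     # observed target
    y_hat = f_x                                       # the best possible predictor
    return float(np.mean((y - y_hat) ** 2))


# Compare the theoretical floor Var(eps) with the simulated MSE.
for noise_std in (0.1, 0.5, 1.0):
    print(f"Var(eps) = {noise_std ** 2:.4f}   oracle MSE = {oracle_mse(noise_std):.4f}")
```

In this sketch the same ideal predictor is "excellent" or "poor" depending only on the noise level, which is why a fixed error threshold cannot be meaningful across applications.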