Solved – Machine learning classifiers big-O or complexity

algorithms, classification, machine learning, multiple-comparisons, time complexity

To evaluate the performance of a new classifier algorithm, I'm trying to compare its accuracy and its complexity (big-O for training and classifying). From Machine Learning: a review I got a complete list of supervised classifiers, an accuracy table comparing the algorithms, and 44 test problems from the UCI data repository. However, I can't find a review, paper, or website with the big-O for common classifiers like:

  • C4.5
  • RIPPER (I think this might not be possible, but who knows)
  • ANN with Back Propagation
  • Naive Bayesian
  • K-NN
  • SVM

If anyone has expressions for these classifiers, it would be very useful. Thank you.

Best Answer

Let $N$ = number of training examples, $d$ = dimensionality of the features and $c$ = number of classes.

Then the training complexities are:

  1. Naive Bayes is $O(Nd)$: all it needs to do is compute the frequency of every feature value $d_i$ for each class (see the sketch after this list).
  2. $k$-NN is in $\mathcal{O}(1)$ (some people even say there is no training step at all), although the space complexity of training is $\mathcal{O}(Nd)$ since you need to store the data, which also takes time.
  3. Nonlinear non-approximate SVM is $O(N^2)$ or $O(N^3)$ depending on the kernel. You can get an $O(N^3)$ down to $O(N^{2.3})$ with some tricks.
  4. Approximate SVM is $O(NR)$, where $R$ is the number of iterations.
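
To make the $O(Nd)$ claim for Naive Bayes training concrete, here is a minimal sketch (plain Python with hypothetical function and variable names, assuming categorical features): training is a single pass over the $N$ examples, touching each of the $d$ feature values once.

```python
from collections import defaultdict

def train_naive_bayes(X, y):
    """Count class priors and per-class feature-value frequencies.

    X: list of N examples, each a list of d categorical feature values.
    y: list of N class labels.
    One pass over N examples, d features each => O(Nd) time.
    """
    class_counts = defaultdict(int)        # one counter per class
    feature_counts = defaultdict(int)      # keyed by (class, feature index, value)
    for xi, yi in zip(X, y):               # N iterations
        class_counts[yi] += 1
        for j, value in enumerate(xi):     # d iterations per example
            feature_counts[(yi, j, value)] += 1
    return class_counts, feature_counts
```

At prediction time these counts are turned into (smoothed) probabilities, which is where the $\mathcal{O}(cd)$ testing cost below comes from.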

Testing complexities:

  1. Naive Bayes is in $\mathcal{O}(cd)$ since you have to retrieve $d$ feature values for each of the $c$ classes.
  2. $k$-NN is in $\mathcal{O}(Nd)$ since you have to compare the test point to every data point in your database (see the brute-force sketch below).
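
Here is a brute-force $k$-NN prediction sketch illustrating that $\mathcal{O}(Nd)$ testing cost (plain Python, hypothetical names; practical libraries often use KD-trees or ball trees to do better in low dimensions):

```python
import math
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Brute-force k-NN: compare x_test to all N stored points.

    Each distance computation costs O(d), so one prediction is O(Nd)
    (plus O(N log N) here for the full sort; a heap would give O(N log k)).
    """
    distances = []
    for xi, yi in zip(X_train, y_train):    # N stored examples
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, x_test)))  # O(d)
        distances.append((dist, yi))
    distances.sort(key=lambda t: t[0])
    top_k_labels = [label for _, label in distances[:k]]
    return Counter(top_k_labels).most_common(1)[0][0]
```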

Source: "Core Vector Machines: Fast SVM Training on Very Large Data Sets" - http://machinelearning.wustl.edu/mlpapers/paper_files/TsangKC05.pdf

Sorry I don't know about the others.