This question stems from Pedro Domingos' excellent paper "A Few Useful Things to Know About Machine Learning." The paper is extremely clear and well-written, but I still have a clarification question. Namely, what is the difference between what Domingos describes on page 1 as a "learner" and a "classifier"? I have largely taken these to be synonymous. The sentence that threw me for a loop in the paper is (again, on page 1): "The test of the learner is whether this classifier produces the correct output yt for future examples xt." I thought this was simply the test of the classifier. Any clarification or further reading suggestions would be greatly appreciated.
Solved – the difference between a “learner” and “classifier” in supervised learning
classification, machine learning, supervised learning
Related Solutions
Generally, the problems of machine learning may be considered variations on function estimation for classification, prediction or modeling.
In supervised learning one is furnished with inputs ($x_1, x_2, \ldots$) and outputs ($y_1, y_2, \ldots$) and is challenged with finding a function that approximates this behavior in a generalizable fashion. The output could be a class label (in classification) or a real number (in regression): these are the "supervision" in supervised learning.
In the case of unsupervised learning, in the base case, you receive inputs $x_1, x_2, \ldots$, but neither target outputs nor rewards from the environment are provided. Based on the problem (classification or prediction) and your background knowledge of the space sampled, you may use various methods: density estimation (estimating some underlying PDF for prediction), k-means clustering (classifying unlabeled real-valued data), k-modes clustering (classifying unlabeled categorical data), etc.
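For instance, here is a minimal sketch of the density-estimation case using scikit-learn's KernelDensity; the data are synthetic, made up purely for illustration:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Unlabeled, real-valued inputs x_1, x_2, ... (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(100, 1))

# Estimate the underlying PDF with a Gaussian kernel
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# score_samples returns log-densities; exponentiate to get density estimates
print(np.exp(kde.score_samples(np.array([[0.0], [2.0]]))))
```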
Semi-supervised learning involves function estimation on labeled and unlabeled data. This approach is motivated by the fact that labeled data is often costly to generate, whereas unlabeled data is generally not. The challenge here mostly involves the technical question of how to treat data mixed in this fashion. See this Semi-Supervised Learning Literature Survey for more details on semi-supervised learning methods.
In addition to these kinds of learning, there are others, such as reinforcement learning, whereby the learning method interacts with its environment by producing actions $a_1, a_2, \ldots$ that produce rewards or punishments $r_1, r_2, \ldots$
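To make the action/reward loop concrete, here is a minimal sketch of a toy reinforcement learning setup, an epsilon-greedy agent on a multi-armed bandit; the environment and its reward probabilities are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
true_reward_prob = [0.2, 0.5, 0.8]  # hypothetical environment: 3 arms
estimates = np.zeros(3)             # agent's running value estimates
counts = np.zeros(3)

for t in range(1000):
    # Produce an action a_t: explore with probability 0.1, otherwise exploit
    a = rng.integers(3) if rng.random() < 0.1 else int(np.argmax(estimates))
    # The environment responds with a reward r_t
    r = float(rng.random() < true_reward_prob[a])
    # Update the value estimate for the chosen action from the observed reward
    counts[a] += 1
    estimates[a] += (r - estimates[a]) / counts[a]

print(estimates)  # approaches the true reward probabilities
```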
1.
Let's look at a simple example of trying to predict housing prices. Assume we have a dataset that looks like this:
Cost | Sq Ft | N bedrooms
-----|-------|-----------
100K | 1,800 | 4
120K | 1,300 | 3
220K | 2,200 | 5
In the case of supervised learning we would know the cost (these are our $y$ labels) and we would use our set of features (Sq Ft and N bedrooms) to build a model to predict the housing cost. The formula would look like:
Cost ~ Sq Ft + N bedrooms
Now in unsupervised learning we would not know the cost of the house, but we would still know the features. Therefore, we would train a model and try to group together the types of houses that are similar. For an example of this, look at k-means clustering (http://scikit-learn.org/stable/modules/clustering.html#clustering).
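Here is a rough sketch of both cases in scikit-learn, using the three toy rows from the table above (far too little data for a meaningful fit, so this is purely illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

X = np.array([[1800, 4], [1300, 3], [2200, 5]])  # features: Sq Ft, N bedrooms
y = np.array([100_000, 120_000, 220_000])        # labels: Cost

# Supervised: learn Cost ~ Sq Ft + N bedrooms, then predict for a new house
reg = LinearRegression().fit(X, y)
print(reg.predict([[1500, 3]]))

# Unsupervised: ignore the costs and group similar houses by their features
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)
```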
The Elements of Statistical Learning is a great, free book which covers all of this very nicely (http://web.stanford.edu/~hastie/local.ftp/Springer/OLD/ESLII_print4.pdf).
2.
Each type of learning method may have a set of parameters, which are called model parameters. The training phase is used to find the set of parameters which generalizes to your data the best. That book also gives very nice information on the different learning methods and their parameters.
For example, in the learning algorithm called SVM (with an RBF kernel) there is a term that looks like $\exp(-\gamma\|x - x'\|^{2})$. In this example the $\gamma$ parameter is what we try to optimize using the training data.
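One common way to choose $\gamma$ over the training data is a cross-validated grid search; here is a minimal sketch with scikit-learn's GridSearchCV on synthetic data (the candidate values are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

# Try several candidate gamma values; each is scored by 5-fold cross-validation
search = GridSearchCV(SVC(kernel="rbf"), {"gamma": [0.001, 0.01, 0.1, 1.0]}, cv=5)
search.fit(X, y)
print(search.best_params_)
```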
Best Answer
I tend to find this kind of distinction mostly in the world of programmers interested in machine learning. The distinction does not appear in the world of statisticians and researchers.
The idea is that a classifier is a program built by a learner. An illuminating intuition comes from one of the many definitions of machine learning: "programming with data." So, you have data (a training set), and from that data, using a computer program, you build another program (for example, a decision tree). The program which builds the decision tree from the data is the learner. The decision tree is a classifier, because a classifier is a program which is able to predict: it takes only the input data and for each instance produces the output data.
An alternative way to understand this is that a learner takes the input $x_1, x_2, \ldots, x_p, y$ and produces a classifier. A classifier takes as input $x'_1, x'_2, \ldots, x'_p$ and produces $y'$.
As I said, in research papers this distinction is hard to find. It seems that the researchers are interested only in how to describe the model. When they come to describe how to build that model, then they talk about a learner, and when they talk about how to predict with that model, then they talk about a classifier. So, a third alternative is a functional one. The function of fitting a model is the function of a learner, while the function of predicting values is a function of a classifier.
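In scikit-learn terms, for example, `fit` plays the role of the learner and the fitted model's `predict` plays the role of the classifier; a minimal sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)

# The learner: takes (x_1, ..., x_p, y) pairs and builds a model
learner = DecisionTreeClassifier(random_state=0)
classifier = learner.fit(X, y)  # the fitted tree is the classifier

# The classifier: takes only inputs x'_1, ..., x'_p and produces y'
print(classifier.predict(X[:5]))
```

(scikit-learn bundles both roles into a single estimator object, but the `fit`/`predict` split mirrors the functional distinction above.)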
Note that a regressor is the same as a classifier, only the nature of the output is different.