Machine Learning – Classifier vs Model vs Estimator: Understanding the Key Differences

machine learning

What is the difference between a classifier, model and estimator?

From what I can tell:

  • an estimator is a predictor found from a regression algorithm
  • a classifier is a predictor found from a classification algorithm
  • a model can be either an estimator or a classifier

But from looking online, it appears that I may have these definitions mixed up. So, what are the true definitions in the context of machine learning?

Best Answer

  • estimator: This isn't a word with a rigorous definition, but it is usually associated with finding a value from data without measuring it exactly. If we didn't explicitly count the change in our pocket, we might use an estimate. That said, in machine learning it is most frequently used in conjunction with parameter estimation or density estimation. In both cases there is an assumption that the data we currently have comes in a form that can be described with a function. With parameter estimation, we believe that the function is a known function with additional parameters, such as a rate or a mean, and we estimate the values of those parameters. In density estimation we may not even have an assumption about the form of the function, but we attempt to estimate the function regardless. Once we have an estimate, we may have a model at our disposal. The estimator, then, is the method of generating estimates, for example the method of maximum likelihood.
  • classifier: This specifically refers to a type of function (and the use of that function) where the response (or range, in functional language) is discrete. By comparison, a regressor has a continuous response. There are additional response types, but these are the two best known. Once we have built a classifier, it is expected to predict, from within a finite set of classes, which class a vector of data most likely indicates. As an example, voice recognition software may record a meeting and attempt to determine, at any given time, which of the finite number of meeting attendees is speaking. In building this software we would give each attendee a number that is nominal only and attempt to classify each segment of speech to one of those numbers.
  • model: The model is the function (or pooled set of functions) that you may accept or reject as being representative of your phenomenon. The word stems from the idea that you may apply domain knowledge to explaining or predicting the phenomenon, though this isn't required. A non-parametric model might be derived entirely from the data at hand, but the result is often still called a model. This terminology highlights the fact that what has been constructed is not reality but only a 'model' of reality. As George Box said, "All models are wrong but some are useful". Having a model allows you to predict, but that may not be its purpose; it could also be used to simulate or to explain.
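
To make the three terms concrete, here is a minimal sketch in Python (standard library only). It is an illustration under stated assumptions, not a canonical implementation: the function names, the pitch values, and the choice of a one-dimensional Gaussian with equal class priors are all hypothetical. The estimator is the maximum-likelihood procedure, the model is the fitted density it produces, and the classifier is a function with a discrete response built on top of those models.

```python
import math
from statistics import fmean

def fit_gaussian_mle(samples):
    """Estimator: maximum-likelihood estimates of a Gaussian's mean and variance."""
    mu = fmean(samples)
    # The MLE of the variance divides by n (not n - 1).
    var = fmean([(x - mu) ** 2 for x in samples])
    return mu, var

def gaussian_pdf(x, mu, var):
    """Model: the fitted density, which we may accept or reject as describing the data."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify(x, class_params):
    """Classifier: discrete response, the label whose fitted density is highest at x.

    Assumes equal prior probability for every class (an assumption of this sketch).
    """
    return max(class_params, key=lambda label: gaussian_pdf(x, *class_params[label]))

# Hypothetical toy data: pitch (Hz) of speech segments for two meeting attendees,
# each identified by a nominal number.
pitch_by_speaker = {
    0: [110.0, 115.0, 120.0, 125.0, 130.0],
    1: [200.0, 205.0, 210.0, 215.0, 220.0],
}

# Running the estimator on each attendee's data yields one model per class.
class_params = {label: fit_gaussian_mle(xs) for label, xs in pitch_by_speaker.items()}

print(class_params)
print(classify(118.0, class_params), classify(212.0, class_params))
```

A regressor would differ only in its response: instead of returning one label from a finite set, it would return a continuous value.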