Machine Learning – Objective Function, Cost Function, and Loss Function Differences

artificial intelligence, machine learning, terminology

In machine learning, people talk about the objective function, the cost function, and the loss function. Are they just different names for the same thing? When should each be used? If they do not always refer to the same thing, what are the differences?

Best Answer

These are not very strict terms and they are highly related. However:

  • Loss function is usually a function defined on a single data point, its prediction, and its label, and it measures the penalty. For example (see the first sketch after this list):
      • square loss $l(f(x_i|\theta),y_i) = \left (f(x_i|\theta)-y_i \right )^2$, used in linear regression
      • hinge loss $l(f(x_i|\theta), y_i) = \max(0, 1-f(x_i|\theta)y_i)$, used in SVMs
      • 0/1 loss $l(f(x_i|\theta), y_i) = \mathbb{1}[f(x_i|\theta) \neq y_i]$, used in theoretical analysis and in the definition of accuracy
  • Cost function is usually more general. It might be a sum of loss functions over the training set plus some model complexity penalty (regularization). For example (see the second sketch after this list):
      • mean squared error $MSE(\theta) = \frac{1}{N} \sum_{i=1}^N \left (f(x_i|\theta)-y_i \right )^2$
      • SVM cost function $SVM(\theta) = \|\theta\|^2 + C \sum_{i=1}^N \xi_i$ (subject to margin constraints connecting each slack $\xi_i$ with $\theta$ and the training set)
  • Objective function is the most general term for any function that you optimise during training. For example, the probability of generating the training set under a maximum likelihood approach is a well-defined objective function, but it is neither a loss function nor a cost function (though you could define an equivalent cost function). More examples (see the third sketch after this list):
      • MLE is a type of objective function (which you maximise)
      • divergence between classes can be an objective function, but it is hardly a cost function, unless you define something artificial like $1-\text{Divergence}$ and name it a cost
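To make these terms concrete, here are three minimal Python sketches, one per term above. The function names and signatures are illustrative, not any library's API.

First, per-example losses. Each function maps one prediction $f(x_i|\theta)$ and one label $y_i$ to a single penalty, which is the hallmark of a loss function:

```python
def square_loss(prediction, label):
    """Square loss for a single example, as in linear regression."""
    return (prediction - label) ** 2

def hinge_loss(prediction, label):
    """Hinge loss for a single example; the label is assumed to be -1 or +1."""
    return max(0.0, 1.0 - prediction * label)

def zero_one_loss(prediction, label):
    """0/1 loss: 1 if the prediction disagrees with the label, else 0."""
    return 1.0 if prediction != label else 0.0
```

Second, cost functions, which aggregate losses over the whole training set and may add a regularization penalty. The SVM version below folds the constraints in by using the fact that, at the optimum, each slack $\xi_i$ equals the hinge loss:

```python
import numpy as np

def mse_cost(predictions, labels):
    """Mean squared error: the square loss averaged over the training set."""
    return float(np.mean((np.asarray(predictions) - np.asarray(labels)) ** 2))

def svm_cost(theta, predictions, labels, C=1.0):
    """Soft-margin SVM cost ||theta||^2 + C * sum(xi_i), with each slack xi_i
    replaced by its optimal value, the hinge loss max(0, 1 - f(x_i|theta) y_i)."""
    slacks = np.maximum(0.0, 1.0 - np.asarray(predictions) * np.asarray(labels))
    return float(np.dot(theta, theta) + C * np.sum(slacks))
```

Third, an objective that is maximised rather than minimised. This hypothetical example (not from the original answer) grid-searches the maximum likelihood estimate of a coin's bias; negating the log-likelihood would turn it into an equivalent cost to minimise:

```python
import numpy as np

def log_likelihood(p, flips):
    """Bernoulli log-likelihood of bias p given 0/1 flips -- the objective to maximise."""
    flips = np.asarray(flips)
    return float(np.sum(flips * np.log(p) + (1 - flips) * np.log(1 - p)))

flips = [1, 0, 1, 1, 0, 1]
candidates = np.linspace(0.01, 0.99, 99)  # grid of candidate biases
best = max(candidates, key=lambda p: log_likelihood(p, flips))
# Maximising the log-likelihood is equivalent to minimising the negative
# log-likelihood, which would be the corresponding "cost" formulation.
print(best)  # ~0.67, i.e. the sample mean 4/6
```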

Long story short, I would say that:

A loss function is a part of a cost function, which is itself a type of objective function.

All that being said, these terms are far from strict, and depending on the context, research group, or background, they can shift and be used with different meanings. The main (only?) common thread is that "loss" and "cost" functions are things one wants to minimise, while an objective function is something one wants to optimise (which can mean either maximisation or minimisation).
