Which loss function does the maximum likelihood estimator minimize

Tags: bayesian, estimation, parameter estimation

I'm trying to understand maximum likelihood estimators in the context of general estimation theory. I know that the Bayesian (posterior mean) estimator minimizes the mean squared loss, and the MAP estimator minimizes the all-or-nothing loss (the loss is zero if the estimator gives the correct parameter and one otherwise). Which loss function does the maximum likelihood estimator minimize?

My thought was that it is the negative of the log-likelihood function, but the definition of a loss function involves both an estimator $T(X)$ and a parameter $s$. As I see it, the negative of the log-likelihood function does not contain any estimator.

Best Answer

The Kullback-Leibler divergence (between the empirical and theoretical probability distributions) is the loss function minimized by the MLE, at least according to this derivation, which looks legitimate at first glance.
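
To sketch the connection (introducing $\hat{p}$ for the empirical distribution of the sample $x_1,\dots,x_n$ and $p_\theta$ for the model, with a discrete sample space for simplicity; the continuous case is analogous): minimizing the KL divergence from the empirical distribution to the model is the same as maximizing the average log-likelihood, because the entropy term does not depend on $\theta$:

$$
\begin{aligned}
\hat{\theta}
&= \arg\min_{\theta}\, D_{\mathrm{KL}}\!\left(\hat{p} \,\|\, p_\theta\right)
 = \arg\min_{\theta} \sum_{x} \hat{p}(x) \log \frac{\hat{p}(x)}{p_\theta(x)} \\
&= \arg\min_{\theta} \left[ \sum_{x} \hat{p}(x)\log \hat{p}(x) - \sum_{x} \hat{p}(x)\log p_\theta(x) \right] \\
&= \arg\max_{\theta} \sum_{x} \hat{p}(x)\log p_\theta(x)
 = \arg\max_{\theta} \frac{1}{n}\sum_{i=1}^{n} \log p_\theta(x_i),
\end{aligned}
$$

which is exactly the maximum likelihood estimator. One way to read this in terms of the question's concern: the data-dependent object playing the role of $T(X)$ is the empirical distribution $\hat{p}$, and the loss compares it to the theoretical distribution $p_\theta$ rather than comparing a point estimate to the parameter $s$ directly.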
