If you use a point estimate that maximizes $P(x | \theta)$, what does that say about your philosophy? (frequentist or Bayesian or something else?)

bayesian, frequentist, likelihood, maximum likelihood, philosophical

If somebody said

"That method uses the MLE, the point estimate for the parameter which maximizes $\mathrm{P}(x|\theta)$; therefore it is frequentist, and furthermore it is not Bayesian."

would you agree?

  • Update on the background: I recently read a paper that claims to be frequentist. I don't agree with their claim; at best, I feel it's ambiguous. The paper does not explicitly mention either the MLE or, for that matter, the MAP. They just take a point estimate and proceed as if this point estimate were the true value. They do not do any analysis of the sampling distribution of this estimator, or anything like that; the model is quite complex, so such analysis is probably not possible. They do not use the word 'posterior' at any point either. They just take this point estimate at face value and proceed to their main topic of interest – inferring missing data.
    I don't think there is anything in their approach which suggests what their philosophy is. They may have intended to be frequentist (because they feel obliged to wear their philosophy on their sleeve), but their actual approach is quite simple/convenient/lazy/ambiguous.
    I'm inclined now to say that the research doesn't really have any philosophy behind it; instead I think their attitude was more pragmatic or convenient:

    "I have observed data, $x$, and I wish to estimate some missing data, $z$. There is a parameter $\theta$ which controls the relationship between $z$ and $x$. I don't really care about $\theta$ except as a means to an end. If I have an estimate for $\theta$, it will make it easier to predict $z$ from $x$. I will choose a point estimate of $\theta$ because it's convenient; in particular, I will choose the $\hat{\theta}$ that maximizes $\mathrm{P}(x|\theta)$."

The idea of an unbiased estimator is clearly a Frequentist concept. This is because it doesn't condition on the data, and it describes a nice property (unbiasedness) which would hold for all values of the parameter.

In Bayesian methods, the roles of the data and the parameter are sort of reversed. In particular, we now condition on the observed data and proceed to make inferences about the value of the parameter. This requires a prior.

So far so good, but where does the MLE (Maximum Likelihood Estimate) fit into all this? I get the impression that many people feel that it is Frequentist (or more precisely, that it is not Bayesian). But I feel that it is Bayesian because it involves taking the observed data and then finding the parameter which maximizes $P(\text{data} \mid \text{parameter})$. Implicitly, the MLE uses a uniform prior, conditions on the data, and maximizes $P(\text{parameter} \mid \text{data})$. Is it fair to say that the MLE looks both Frequentist and Bayesian? Or does every simple tool have to fall into exactly one of those two categories?
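The uniform-prior reading can be written out explicitly. This is only a sketch, assuming the prior $\mathrm{P}(\theta)$ is constant over the parameter space (possibly improper); the labels MAP and MLE on $\hat{\theta}$ are notation introduced here for illustration:

$$
\hat{\theta}_{\mathrm{MAP}}
= \arg\max_{\theta} \mathrm{P}(\theta|x)
= \arg\max_{\theta} \frac{\mathrm{P}(x|\theta)\,\mathrm{P}(\theta)}{\mathrm{P}(x)}
= \arg\max_{\theta} \mathrm{P}(x|\theta)\,\mathrm{P}(\theta)
= \arg\max_{\theta} \mathrm{P}(x|\theta)
= \hat{\theta}_{\mathrm{MLE}}
$$

The third equality drops $\mathrm{P}(x)$ because it does not depend on $\theta$; the fourth uses the assumption that $\mathrm{P}(\theta)$ is constant. Note that the match depends on the parametrization in which the prior is taken to be flat.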

The MLE is consistent, but I feel that consistency can be presented as a Bayesian idea. Given arbitrarily large samples, the estimate converges to the true value. The statement "the estimate will be equal to the true value" then holds for all values of the parameter. The interesting thing is that this statement also holds true if you condition on the observed data, making it Bayesian. This interesting aside holds for the MLE, but not for an unbiased estimator.

This is why I feel that the MLE is the 'most Bayesian' of the methods that might be described as Frequentist.

Anyway, most Frequentist properties (such as unbiasedness) apply in all cases, including finite sample sizes. The fact that consistency only holds in an impossible scenario (an infinite sample within one experiment) suggests that consistency isn't such a useful property.
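A small simulation can illustrate the difference between a finite-sample property and a limiting one. The sketch below assumes a hypothetical example that is not taken from the discussion above: exponential data with rate 2, for which the MLE of the rate is 1 / (sample mean). This particular MLE is biased at every finite sample size (its expectation is $n\lambda/(n-1)$ for $n > 1$), yet the bias shrinks as $n$ grows. The function names, sample sizes, and seed are illustrative choices.

```python
import random


def exponential_rate_mle(sample):
    """MLE of the rate of an exponential distribution: 1 / sample mean."""
    return len(sample) / sum(sample)


def simulate_mle(true_rate, n, reps, seed=0):
    """Draw `reps` independent samples of size `n` and return the MLE of each."""
    rng = random.Random(seed)
    return [
        exponential_rate_mle([rng.expovariate(true_rate) for _ in range(n)])
        for _ in range(reps)
    ]


TRUE_RATE = 2.0  # assumed value, chosen only for illustration

small = simulate_mle(TRUE_RATE, n=5, reps=20000)
large = simulate_mle(TRUE_RATE, n=200, reps=2000)

# Average of the estimator over repeated experiments (its sampling mean).
mean_small = sum(small) / len(small)
mean_large = sum(large) / len(large)

print(f"true rate:            {TRUE_RATE}")
print(f"mean of MLE, n = 5:   {mean_small:.3f}")
print(f"mean of MLE, n = 200: {mean_large:.3f}")
```

With $n = 5$ the average estimate should sit noticeably above the true rate, while with $n = 200$ it should be close to it. Under these assumptions, unbiasedness is not something the MLE offers in general at finite sample sizes.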

Given a realistic (i.e. finite) sample, is there a Frequentist property that holds true of the MLE? If not, the MLE isn't really Frequentist.

Best Answer

"Or does every simple tool have to fall into exactly one of those two categories?"

No. Simple (and not so simple) tools can be studied from many different viewpoints. The likelihood function by itself is a cornerstone of both Bayesian and frequentist statistics, and can be studied from both points of view! If you want, you can study the MLE as an approximate Bayes solution, or you can study its properties with asymptotic theory, in a frequentist way.
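As a rough illustration of the two viewpoints, here is a minimal sketch for a hypothetical coin-flip model with `k` successes in `n` trials (the model, the function names, and the numbers are assumptions made for the example, not part of the answer). The frequentist summary is the MLE with its asymptotic (Wald) standard error; the Bayesian summary is the Beta posterior obtained from a uniform Beta(1, 1) prior.

```python
import math


def mle_summary(k, n):
    """Frequentist view: MLE of the success probability and its Wald standard error."""
    p_hat = k / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


def flat_prior_posterior_summary(k, n):
    """Bayesian view: the posterior is Beta(k + 1, n - k + 1) under a uniform prior."""
    a, b = k + 1, n - k + 1
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    mode = (a - 1) / (a + b - 2)  # simplifies to k / n, the same value as the MLE
    return mean, sd, mode


for n in (10, 100, 10000):
    k = 3 * n // 10  # illustrative data: 30% successes
    p_hat, se = mle_summary(k, n)
    mean, sd, mode = flat_prior_posterior_summary(k, n)
    print(
        f"n={n:>5}: MLE={p_hat:.4f} (SE {se:.4f}) | "
        f"posterior mode={mode:.4f}, mean={mean:.4f} (SD {sd:.4f})"
    )
```

In this sketch the posterior mode coincides with the MLE at every `n`, and the posterior mean and standard deviation approach the MLE and its standard error as `n` grows. That is one sense in which the same number can be read as an approximate Bayes solution or as an estimator with asymptotic frequentist properties.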
