Unbiasedness isn't necessarily especially important on its own.
Aside from a very limited set of circumstances, most useful estimators are biased, however they're obtained.
If two estimators have the same variance, one can readily mount an argument for preferring an unbiased one to a biased one, but that's an unusual situation to be in (that is, you may reasonably prefer unbiasedness, ceteris paribus -- but those pesky ceteris are almost never paribus).
More typically, if you want unbiasedness you'll be adding some variance to get it, and then the question becomes: why would you do that?
Bias is how far the expected value of my estimator sits above the true value, on average (with negative bias indicating it sits below).
When I'm considering a small-sample estimator, I don't really care about that. I'm usually more interested in how far wrong my estimator will be in this instance, i.e. its typical distance from the true value... something like a root-mean-square error or a mean absolute error would make more sense.
So if you like low variance and low bias, asking for say a minimum mean square error estimator would make sense; these are very rarely unbiased.
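This is just the usual decomposition: for a scalar parameter, $\text{MSE}(\hat{\theta}) = E[(\hat{\theta}-\theta)^2] = \text{Var}(\hat{\theta}) + \text{Bias}(\hat{\theta})^2$, so accepting a little bias in exchange for a larger reduction in variance lowers the MSE.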
Bias and unbiasedness are useful notions to be aware of, but unbiasedness isn't an especially useful property to seek unless you're only comparing estimators with the same variance.
ML estimators tend to be low-variance; they're usually not minimum MSE, but they often have lower MSE than modifying them to be unbiased (when you can do it at all) would give you.
As an example, consider estimating the variance when sampling from a normal distribution. Writing $S^2=\sum_i (x_i-\bar{x})^2$ for the sum of squared deviations from the sample mean, we have $\hat{\sigma}^2_\text{MMSE} = \frac{S^2}{n+1}$, $\hat{\sigma}^2_\text{MLE} = \frac{S^2}{n}$, $\hat{\sigma}^2_\text{Unb} = \frac{S^2}{n-1}$ (indeed the minimum-MSE estimator of the variance always has a larger denominator than $n-1$).
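A quick Monte Carlo check makes the trade-off concrete; the sketch below is my own, not part of the answer (the sample size, true variance and seed are arbitrary choices), and it should show the unbiased estimator with the largest MSE and the MMSE one with the smallest.

```python
import numpy as np

# Monte Carlo sketch: bias and MSE of the three variance estimators above.
# n, sigma2 and the seed are arbitrary choices for illustration.
rng = np.random.default_rng(0)
sigma2, n, reps = 4.0, 10, 200_000

x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
# S^2 = sum of squared deviations from the sample mean, as in the formulas above
S2 = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

for label, denom in [("MMSE", n + 1), ("MLE", n), ("Unbiased", n - 1)]:
    est = S2 / denom
    bias = est.mean() - sigma2
    mse = ((est - sigma2) ** 2).mean()
    print(f"{label:9s} bias={bias:+.4f}  MSE={mse:.4f}")
```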
A general answer is that a method-of-moments estimator is not invariant under a bijective change of parameterisation, while a maximum likelihood estimator is invariant. Therefore, they almost never coincide. (Almost never, that is, across all possible transforms.)
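As one concrete illustration of the two approaches disagreeing (the gamma distribution and the numbers here are my choice, not anything from the question): matching the first two moments of a gamma sample and maximising its likelihood give different shape and scale estimates on the same data.

```python
import numpy as np
from scipy import stats

# Illustrative sketch: method-of-moments vs maximum-likelihood estimates
# of a gamma distribution's shape and scale on the same sample.
rng = np.random.default_rng(1)
x = rng.gamma(shape=2.0, scale=3.0, size=50)

m, v = x.mean(), x.var()                 # sample mean and variance
shape_mom, scale_mom = m**2 / v, v / m   # from E[X] = k*theta, Var[X] = k*theta^2

shape_mle, _, scale_mle = stats.gamma.fit(x, floc=0)  # MLE with location fixed at 0

print(f"MoM: shape={shape_mom:.3f}, scale={scale_mom:.3f}")
print(f"MLE: shape={shape_mle:.3f}, scale={scale_mle:.3f}")
```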
Furthermore, as stated in the question, there are many MoM estimators. An infinity of them, actually. But they are all based on the empirical distribution, $\hat{F}$, which may be seen as a non-parametric MLE of $F$, although this does not relate to the question.
Actually, a more appropriate way to frame the question would be to ask when a moment estimator is sufficient, but this forces the distribution of the data to be from an exponential family, by the Pitman-Koopman lemma, a case when the answer is already known.
Note: In the Laplace distribution, when the mean is known, the problem is equivalent to observing the absolute values, which are then exponential variates and part of an exponential family.
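A small sketch of that note (the numbers are mine): with the mean $\mu$ known, the absolute deviations $|x_i-\mu|$ are exponential with mean $b$, so matching that first moment gives the same estimate of the Laplace scale $b$ as maximising the likelihood.

```python
import numpy as np

# Laplace(mu, b) with mu known: |x - mu| ~ Exponential(mean b), so the
# first-moment MoM estimate and the MLE of b are both the mean absolute deviation.
rng = np.random.default_rng(2)
mu, b = 0.0, 1.5
x = rng.laplace(mu, b, size=1000)

b_hat = np.abs(x - mu).mean()  # MLE of b, and also the MoM estimate from E|X - mu| = b
print(f"b_hat = {b_hat:.3f} (true b = {b})")
```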
Are you interested in prediction or inference? If you actually know the distribution (which in practice you never do, except for binary data), there are classical results that show you can't really beat the MLE if your sample size is reasonable. With small sample sizes, penalized likelihoods can do well for prediction, such as the Elastic Net.
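A rough sketch of that small-sample point (the toy data, dimensions and settings are mine, not from the answer): with few observations and many mostly-irrelevant predictors, a cross-validated elastic net often predicts new data better than plain least squares, which is the MLE under Gaussian errors.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV, LinearRegression
from sklearn.metrics import mean_squared_error

# Toy comparison: ordinary least squares vs cross-validated elastic net
# on a small sample with mostly-irrelevant predictors.
rng = np.random.default_rng(3)
n, p = 30, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]          # only three predictors matter
y = X @ beta + rng.normal(scale=2.0, size=n)

X_test = rng.normal(size=(2000, p))
y_test = X_test @ beta + rng.normal(scale=2.0, size=2000)

ols = LinearRegression().fit(X, y)
enet = ElasticNetCV(cv=5).fit(X, y)

print("OLS  test MSE:", mean_squared_error(y_test, ols.predict(X_test)))
print("ENet test MSE:", mean_squared_error(y_test, enet.predict(X_test)))
```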
Also, you don't have to be a Bayesian in order to use Bayesian methods. All Bayesian methods have frequentist properties (confidence intervals, p-values and the like), but these can be difficult to compute. Frank Samaniego, from UC Davis, has a lot of nice theoretical results on this issue.