Solved – Why discriminative models are preferred to generative models for sequence labeling tasks

hidden-markov-model, machine-learning, natural-language, predictive-models

I understand that discriminative models, such as CRFs (Conditional Random Fields), model the conditional probability $P(y|x)$, while generative models, such as HMMs (Hidden Markov Models), model the joint probability $P(y,x)$.

Take CRFs and HMMs, for example. I know that a CRF can use a larger range of features. Apart from that, what else makes CRFs (discriminative models) preferable to HMMs (generative models) in sequence labeling tasks such as Part-of-Speech tagging and NER (Named Entity Recognition)?

Edit:
I found out that HMMs will have to model $P(x)$, while CRFs don't. Why would it make a big difference in sequence labeling tasks?

Best Answer

I think you pretty much nailed it in your Edit. A generative model makes more restrictive assumptions about the distribution of $x$: to model the joint $P(y,x)$, an HMM must assume each observation $x_t$ is generated independently given its tag $y_t$, whereas a CRF conditions on the whole observation sequence and never has to model $P(x)$ at all.

From Minka:

"Unlike traditional generative random fields, CRFs only model the conditional distribution $p(t|x)$ and do not explicitly model the marginal $p(x)$. Note that the labels $\{t_i\}$ are globally conditioned on the whole observation $x$ in CRFs. Thus, we do not assume that the observed data $x$ are conditionally independent as in a generative random field."
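To make the contrast concrete, here is a minimal sketch in Python (all probabilities, words, and feature weights are hypothetical toy values invented for illustration). The HMM's emission table $P(x_t \mid y_t)$ is exactly where $P(x)$ gets modeled, while a linear-chain CRF normalizes only over tag sequences, so it can freely use overlapping features of the whole observation:

```python
import itertools
import math

TAGS = ["N", "V"]
VOCAB = ["dog", "runs"]

# --- HMM: joint model P(x, y) (toy parameters, invented for illustration) ---
trans = {("<s>", "N"): 0.6, ("<s>", "V"): 0.4,
         ("N", "N"): 0.3, ("N", "V"): 0.7,
         ("V", "N"): 0.8, ("V", "V"): 0.2}
emit = {("N", "dog"): 0.7, ("N", "runs"): 0.3,
        ("V", "dog"): 0.1, ("V", "runs"): 0.9}

def hmm_joint(words, tags):
    # P(x, y) = prod_t P(y_t | y_{t-1}) * P(x_t | y_t).
    # The emission factor P(x_t | y_t) is where P(x) gets modeled:
    # it forces each word to depend only on its own tag.
    p, prev = 1.0, "<s>"
    for w, t in zip(words, tags):
        p *= trans[(prev, t)] * emit[(t, w)]
        prev = t
    return p

# --- CRF: conditional model P(y | x) with arbitrary features of x ---
def features(words, i, prev, cur):
    # Overlapping features of the whole observation; the suffix feature
    # below is not expressible as a single HMM emission probability.
    return [("word", words[i], cur),
            ("trans", prev, cur),
            ("ends-in-s", words[i].endswith("s"), cur)]

# Hypothetical "learned" weights, invented for illustration.
weights = {("word", "dog", "N"): 1.5, ("word", "runs", "V"): 1.2,
           ("trans", "N", "V"): 0.8, ("ends-in-s", True, "V"): 0.5}

def crf_conditional(words, tags, weights):
    # P(y | x) = exp(score(y, x)) / Z(x), where Z(x) sums over tag
    # sequences only -- P(x) never appears in the model.
    def score(ts):
        s, prev = 0.0, "<s>"
        for i, t in enumerate(ts):
            s += sum(weights.get(f, 0.0) for f in features(words, i, prev, t))
            prev = t
        return s
    Z = sum(math.exp(score(ts))
            for ts in itertools.product(TAGS, repeat=len(words)))
    return math.exp(score(tags)) / Z
```

Note that the HMM's joint probabilities must sum to 1 over all word *and* tag sequences (it spends modeling capacity on $P(x)$), whereas the CRF's conditional sums to 1 over tag sequences for each fixed input, which is why it can afford richer, correlated features of $x$ without the independence assumption the quote mentions.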