Solved – Limitations of LDA (Latent Dirichlet Allocation)

topic-models

I'd like to get a list of the limitations of LDA. I know that LDA does not work very well on collections of short documents, such as a set of tweets.

Are there other known limitations of LDA? A reference that lists them would be preferable.

Best Answer

Common LDA limitations:

  • Fixed K (the number of topics must be chosen ahead of time and stays fixed)
  • Uncorrelated topics (the Dirichlet distribution over topic proportions cannot capture correlations between topics)
  • Non-hierarchical (the model is flat, whereas in data-limited regimes hierarchical models allow sharing of data)
  • Static (no evolution of topics over time)
  • Bag of words (assumes words are exchangeable; sentence structure is not modeled)
  • Unsupervised (sometimes weak supervision is desirable, e.g. in sentiment analysis)
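
Two of these points are easy to see in a few lines of code. The following is only a minimal sketch using the Python standard library, not an LDA implementation: the example sentences, the symmetric prior `alpha`, and the helper names are all made up for illustration. It shows that two sentences with opposite meanings reduce to the same bag of words, and that topic proportions drawn from a Dirichlet distribution always co-vary negatively, so the prior cannot express "these two topics tend to appear together".

```python
import random
from collections import Counter


def bag_of_words(text):
    """Return word counts, discarding word order (all a bag-of-words model sees)."""
    return Counter(text.lower().split())


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Bag of words: opposite meanings, identical representation.
doc_a = "the dog bit the man"
doc_b = "the man bit the dog"
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# Dirichlet prior: the proportions of any two topics co-vary negatively.
rng = random.Random(0)
alpha = [1.0, 1.0, 1.0]  # hypothetical symmetric prior over 3 topics
samples = [sample_dirichlet(alpha, rng) for _ in range(20000)]
topic0 = [s[0] for s in samples]
topic1 = [s[1] for s in samples]
cov_01 = covariance(topic0, topic1)

print("same bag of words:", same_bag)
print("cov(topic 0, topic 1):", round(cov_01, 4))
```

For a Dirichlet with parameters α_1, ..., α_K and α_0 = Σ α_k, the covariance between components i ≠ j is -α_i α_j / (α_0² (α_0 + 1)), which is -1/36 for the prior assumed here; the empirical value printed above should be close to that and, in particular, negative.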

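A number of these limitations have been addressed in papers that followed the original LDA work, for example hierarchical Dirichlet processes (number of topics), the correlated topic model (topic correlations), dynamic topic models (topics over time), and supervised LDA (supervision). Despite its limitations, LDA remains central to topic modeling and has had a major influence on the field.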