LDA – Natural Interpretation for LDA Hyperparameters

hyperparameter, interpretation, prior, topic-models

Can somebody explain the natural interpretation of the LDA hyperparameters? Alpha and beta are the parameters of the Dirichlet priors on the per-document topic distributions and the per-topic word distributions, respectively. However, can someone explain what it means to choose larger versus smaller values for these hyperparameters? Does that amount to encoding prior beliefs about topic sparsity within documents and about how mutually exclusive the topics are in terms of their words?

Note that this question is about latent Dirichlet allocation, not linear discriminant analysis, which confusingly is also abbreviated LDA.

Best Answer

David Blei gives a great talk introducing LDA at a machine learning summer school: http://videolectures.net/mlss09uk_blei_tm/

In the first video he covers the basic idea of topic modelling extensively and explains how Dirichlet distributions come into play. He walks through the plate notation as if all hidden variables were observed, to show the dependencies. Basically, topics are distributions over words, and documents are distributions over topics.
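As a concrete illustration of that generative story, here is a minimal numpy sketch; the sizes and hyperparameter values below are arbitrary choices for illustration, not taken from the talk. Each topic is drawn from a Dirichlet over the vocabulary, each document draws its topic proportions from a Dirichlet, and every word first picks a topic and then a word from that topic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not taken from the talk
n_topics, vocab_size, n_docs, doc_len = 5, 20, 3, 50
alpha, beta = 0.5, 0.1   # symmetric Dirichlet hyperparameters

# Each topic is a distribution over the vocabulary, drawn from Dirichlet(beta)
topics = rng.dirichlet([beta] * vocab_size, size=n_topics)

for d in range(n_docs):
    # Each document gets its own distribution over topics, drawn from Dirichlet(alpha)
    theta = rng.dirichlet([alpha] * n_topics)
    doc = []
    for _ in range(doc_len):
        z = rng.choice(n_topics, p=theta)        # choose a topic for this word slot
        w = rng.choice(vocab_size, p=topics[z])  # choose a word id from that topic
        doc.append(w)
    print(f"doc {d}: topic proportions = {np.round(theta, 2)}")
```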

In the second video he shows the effect of alpha with some sample graphs: the smaller alpha is, the sparser the distribution. He also introduces some inference approaches.
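To see the sparsity effect he describes, you can sample a document's topic proportions from a symmetric Dirichlet at a few alpha values; this is a small numpy sketch with an arbitrary topic count:

```python
import numpy as np

rng = np.random.default_rng(1)
n_topics = 10  # arbitrary number of topics for the demo

# Draw one document's topic proportions for several symmetric alpha values
for alpha in (0.01, 0.1, 1.0, 10.0):
    theta = rng.dirichlet([alpha] * n_topics)
    print(f"alpha={alpha:>5}: {np.round(theta, 2)}")
```

With alpha = 0.01 almost all of the mass lands on one or two topics, while alpha = 10 spreads it nearly uniformly; the same intuition applies to beta and the per-topic word distributions.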