Solved – Word-topic matrix in Latent Dirichlet Allocation

latent-variable, machine-learning, topic-models, unsupervised-learning

In the latent Dirichlet allocation model described on Wikipedia, is $\beta$ the word-topic matrix?

I understand that $\beta$ is the topic-word matrix and that $\beta_{ij}$ contains the probability of word $j$ given topic $i$, but I would like to confirm it.

Best Answer

Confusingly, the variables in Wikipedia's description of smoothed LDA don't follow the paper that introduced LDA (Blei, Ng & Jordan, 2003). In the paper, $\beta$ is first described exactly as you've described it:

...the word probabilities are parameterized by a $k \times V$ matrix $\beta$ where $\beta_{ij} = p(w_j = 1 \mid z_i = 1)$, which for now we treat as a fixed quantity that is to be estimated.

The authors later introduce smoothed LDA, in which each row of $\beta$ is drawn from an exchangeable (symmetric) Dirichlet distribution with prior parameter $\eta$. Wikipedia presently uses $\beta$ where the paper uses $\eta$, and $\varphi$ where the paper uses $\beta$.
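
To pin down the indexing, here is a minimal NumPy sketch of that generative step, with hypothetical sizes $k = 3$ and $V = 5$; the variable names follow the paper's notation ($\beta$ for the matrix, $\eta$ for the Dirichlet parameter):

```python
import numpy as np

rng = np.random.default_rng(0)

k, V = 3, 5   # hypothetical sizes: 3 topics, 5-word vocabulary
eta = 0.1     # the paper's smoothing prior (Wikipedia's beta)

# Smoothed LDA: each row of the k x V matrix beta is drawn from an
# exchangeable (symmetric) Dirichlet with parameter eta.
beta = rng.dirichlet(np.full(V, eta), size=k)

# beta[i, j] = p(word j | topic i): every row is a distribution over words.
assert np.allclose(beta.sum(axis=1), 1.0)
print(beta[0])  # word distribution for topic 0
```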

Here's the plate notation from the paper:

[Plate diagram of smoothed LDA from the paper]

And from Wikipedia:

[Plate diagram of smoothed LDA from Wikipedia]

I'm not sure which convention is more common in implementations. For example, scikit-learn follows the paper and uses $\eta$ for the prior (its topic_word_prior parameter).
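
As a rough sketch of how that surfaces in practice: scikit-learn's LatentDirichletAllocation exposes the smoothing prior as topic_word_prior, and the fitted components_ attribute plays the role of the paper's $\beta$ once its rows are normalized (the toy corpus below is made up):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["apple banana apple", "banana fruit salad", "apple fruit"]  # toy corpus
X = CountVectorizer().fit_transform(docs)

# topic_word_prior is the paper's eta (Wikipedia's beta): the symmetric
# Dirichlet prior on each topic's word distribution.
lda = LatentDirichletAllocation(n_components=2, topic_word_prior=0.1,
                                random_state=0).fit(X)

# components_ is the fitted k x V topic-word matrix (unnormalized);
# normalizing each row recovers the paper's beta.
beta = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
print(beta.shape)  # (2, V): beta[i, j] = p(word j | topic i)
```

Note that scikit-learn's default for topic_word_prior is 1 / n_components rather than 1, so some smoothing is applied even when you don't set it.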
