Can LDA be used to detect the topic of A SINGLE document?
Yes, in its particular representation of 'topic,' and given a training corpus of (usually related) documents.
LDA represents topics as distributions over words, and documents as distributions over topics. That is, the very purpose of LDA is to arrive at a probabilistic representation of each document as a mixture of topics. For example, the LDA implementation in gensim can return this representation for any given document.
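As a minimal sketch (the toy corpus, tokenization, and parameter choices below are placeholders, not recommendations):

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy corpus: each document is a list of tokens (placeholder data).
docs = [
    ["gene", "dna", "protein", "cell"],
    ["stock", "market", "shares", "bank"],
    ["protein", "cell", "biology", "gene"],
    ["bank", "loan", "market", "finance"],
]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# Fit LDA; num_topics is a modeling choice, not something the data dictates.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=10, random_state=0)

# Topic distribution for a single (possibly unseen) document.
new_doc = dictionary.doc2bow(["gene", "cell", "market"])
print(lda.get_document_topics(new_doc))  # e.g. [(0, 0.7...), (1, 0.2...)]
```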
But this depends on the other documents in the corpus: Any given document will have a different representation if analyzed as part of a different corpus.
That's not typically considered a shortcoming: Most applications of LDA focus on related documents. The paper introducing LDA applies it to two corpora, one of Associated Press articles and one of scientific article abstracts. Edwin Chen's nicely approachable blog post applies LDA to a tranche of emails from Sarah Palin's time as Alaska governor.
If your application demands separating documents into known, mutually exclusive classes, then LDA-derived topics can be used as features for classification. Indeed, the initial paper does just that with the AP corpus, with good results.
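As an illustration of that idea (the labels and the choice of classifier here are hypothetical, and the model, corpus, and dictionary are reused from the sketch above), the per-document topic distributions can simply be fed to an ordinary classifier:

```python
import numpy as np
from gensim.matutils import corpus2dense
from sklearn.linear_model import LogisticRegression

# Hypothetical class labels, one per training document in the toy corpus above.
labels = np.array([0, 1, 0, 1])

# Dense (n_docs x n_topics) feature matrix built from the LDA representation.
X = corpus2dense(lda[corpus], num_terms=lda.num_topics).T

clf = LogisticRegression().fit(X, labels)

# Classify a new document via its topic distribution.
new_bow = dictionary.doc2bow(["protein", "dna", "biology"])
x_new = corpus2dense([lda[new_bow]], num_terms=lda.num_topics).T
print(clf.predict(x_new))
```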
Relatedly, Chen's demonstration doesn't sort documents into exclusive classes, but his documents mostly concentrate their probability on a single LDA topic each. As David Blei explains in this video lecture, the Dirichlet priors can be chosen to favor sparsity. More simply, "a document is penalized for using many topics," as his slides put it. This seems the closest LDA can get to a single, unsupervised topic per document, but it certainly doesn't guarantee every document will be represented that way.
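If that behavior is what you want, the document-topic prior can be pushed in that direction when fitting. In gensim this is the alpha argument (reusing the toy corpus and dictionary from the sketch above; the value 0.01 is only an example of a "small" prior, not a recommendation):

```python
# A small symmetric document-topic prior (alpha) penalizes documents for
# spreading probability over many topics, yielding sparser topic mixtures.
sparse_lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
                      alpha=0.01, passes=10, random_state=0)
```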
It is important to remember that topic models such as LDA were primarily developed for unsupervised text summarization, so there is often no single "best" choice for how many top words to show. Most research papers on topic models use the top 5-20 words; if you show more than 20 words per topic, you start to defeat the purpose of succinctly summarizing the text.
A tolerance $\epsilon > 0.01$ is far too low for showing which words pertain to each topic. A primary purpose of LDA is to group words such that the topic words in each topic are highly probable within that topic. If such a low threshold is chosen, then many, many words will appear in each topic, again defeating the purpose of succinct text summarization. To extract the most probable words, you would be better off choosing a threshold of $\epsilon > 0.9$ or maybe $\epsilon > 0.8$.
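In practice this just means asking the model for a fixed, small number of top words per topic rather than everything above a tiny threshold. With gensim, continuing the earlier sketch, that could look like:

```python
# Print the 10 most probable words for every topic, with their probabilities.
for topic_id, words in lda.show_topics(num_topics=-1, num_words=10, formatted=False):
    print(topic_id, [(word, round(prob, 3)) for word, prob in words])
```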
Seeing wordless topics when using Gensim is probably due to Gensim's own tolerance parameter, "minimum_probability". This parameter defaults to 0.01 (as explained in the Gensim LDA documentation). If you want to see all the words per topic, regardless of how low their probability of appearing in the topic is, you can set minimum_probability = 0.
For LDA, you are best off using the normalized probabilities (via the "get_topic_terms" function of the LdaModel) because they are the most interpretable. I am not intimately familiar with how Gensim estimates the topic-word probabilities, but the unnormalized values are probably a by-product of Bayesian estimation, where it is unnecessary to estimate the denominator directly because (as you've said) it is just normalization.
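A sketch of both points, assuming the toy corpus and dictionary from earlier (the parameter names are gensim's; exact filtering behavior can vary between gensim versions):

```python
# minimum_probability=0 keeps even very low-probability entries instead of
# filtering them out at the default 0.01 threshold.
lda_all = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
                   minimum_probability=0.0, passes=10, random_state=0)

# get_topic_terms returns normalized (word_id, probability) pairs for a topic.
for word_id, prob in lda_all.get_topic_terms(topicid=0, topn=10):
    print(dictionary[word_id], prob)
```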
Best Answer
(Pseudo-code) Computing the similarity between two documents (doc1, doc2) using an existing LDA model:
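Roughly (the names lda, doc1, doc2, and similarity are placeholders for your fitted model, your two documents, and whatever measure you pick):

```
topic_vec_1 = lda(doc1)                        # step 1: topic distribution of doc1
topic_vec_2 = lda(doc2)                        # step 1: topic distribution of doc2
score = similarity(topic_vec_1, topic_vec_2)   # step 2: compare the two vectors
```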
In the first step, you simply apply your LDA model to the two input documents, getting back a vector for each one. Each vector represents that document's topic distribution.
The second step is to apply a similarity measure of your choice to compare the two vectors. You should experiment with different types of similarity measures to see which one works best in your case. Some good options to consider for distance metrics are cosine distance and Hellinger distance. Note that the underlying assumption here is that we consider two documents to be similar if their presumed topics are similar.
Example using Cosine similarity:
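A sketch of what that could look like with gensim (the fitted model lda, its dictionary, and the token lists doc1_tokens / doc2_tokens are assumed to already exist):

```python
from gensim.matutils import cossim

# Sparse (topic_id, probability) vectors for the two documents.
vec1 = lda[dictionary.doc2bow(doc1_tokens)]
vec2 = lda[dictionary.doc2bow(doc2_tokens)]

# Cosine similarity between the topic distributions; values near 1 mean "similar topics".
print(cossim(vec1, vec2))

# For Hellinger distance instead, gensim.matutils.hellinger(vec1, vec2)
# accepts the same sparse vectors.
```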