I was impressed by the results in the ICML 2014 paper "Distributed Representations of Sentences and Documents" by Le and Mikolov. The technique they describe, called "paragraph vectors", learns unsupervised fixed-length representations of arbitrarily long paragraphs and documents by extending the word2vec model. The paper reports state-of-the-art performance on sentiment analysis using this technique.
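For concreteness, here is a minimal sketch of the PV-DM variant of the idea, assuming a toy corpus and plain NumPy (this is illustrative only, not the authors' code): each document gets its own trainable vector, which is averaged with the context word vectors to predict the center word via a softmax over the vocabulary.

```python
import numpy as np

# Toy corpus; in the paper this would be e.g. IMDB reviews.
docs = [["the", "movie", "was", "great"],
        ["the", "movie", "was", "terrible"]]

vocab = sorted({w for d in docs for w in d})
w2i = {w: i for i, w in enumerate(vocab)}
V, D, dim, window = len(vocab), len(docs), 8, 1

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (V, dim))   # input word vectors
P = rng.normal(0, 0.1, (D, dim))   # paragraph (document) vectors
O = rng.normal(0, 0.1, (dim, V))   # softmax output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.05
for epoch in range(200):
    for d, words in enumerate(docs):
        ids = [w2i[w] for w in words]
        for t, target in enumerate(ids):
            ctx = [ids[j]
                   for j in range(max(0, t - window),
                                  min(len(ids), t + window + 1))
                   if j != t]
            # Hidden state: average of the paragraph vector
            # and the context word vectors.
            h = (P[d] + W[ctx].sum(axis=0)) / (1 + len(ctx))
            p = softmax(h @ O)
            err = p.copy()
            err[target] -= 1.0          # gradient of cross-entropy loss
            dh = O @ err
            O -= lr * np.outer(h, err)
            g = dh / (1 + len(ctx))
            P[d] -= lr * g              # update the paragraph vector
            for c in ctx:
                W[c] -= lr * g          # update context word vectors

# After training, each row of P is a fixed-length representation
# of one document, usable as features for a classifier.
print(P.shape)
```

The real model uses hierarchical softmax or negative sampling rather than a full softmax, and at test time the paragraph vector for an unseen document is inferred by gradient steps with the word matrices frozen.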
I was hoping to evaluate this technique on other text classification problems, as an alternative to the traditional bag-of-words representation. However, I ran across a post by the second author in a thread in the word2vec Google group that gave me pause:
I tried myself to reproduce Quoc's results during the summer; I could get error rates on the IMDB dataset to around 9.4% – 10% (depending on how good the text normalization was). However, I could not get anywhere close to what Quoc reported in the paper (7.4% error, that's a huge difference) … Of course we also asked Quoc about the code; he promised to publish it but so far nothing has happened. … I am starting to think that Quoc's results are actually not reproducible.
Has anyone had success reproducing these results yet?
Best Answer
A footnote at http://arxiv.org/abs/1412.5335 (Tomas Mikolov is one of the authors) says