# [TeX/LaTeX] LaTeX to plain text, e.g. for generating statistics

conversion text-only

I would like to convert a large LaTeX project (i.e. one spanning multiple files) into plain text. The purpose is to generate statistics, so the mathematics does not need to be represented. Ideally, all mathematics would simply be ignored.

I have found http://code.google.com/p/textricks/ but could not get it to run. It seems unfinished, but is otherwise exactly what I am looking for.

I would compile the document to a PDF and then run pdftotext on it to get a text file. You should disable all hyphenation and remove the page headers (\pagestyle{empty}) so that only the raw text remains. This way you are working from LaTeX's output rather than its input, and the two can differ. Because the whole project is compiled into a single PDF, it also does not matter that the source is spread over multiple files.
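As a rough sketch of the two settings mentioned above, the preamble of the main file could be extended along these lines. The hyphenat package and \sloppy are my own assumptions about how to switch hyphenation off cleanly. The answer itself names only \pagestyle{empty}.

```latex
% Preamble additions for a text-extraction build.
% Assumption: the hyphenat package is installed; it is one common way to
% disable hyphenation, not necessarily the one the answer had in mind.
\usepackage[none]{hyphenat} % "none" suppresses hyphenation in the whole document
\sloppy                     % relax spacing so unhyphenated lines are less likely to overflow
\pagestyle{empty}           % drop running headers, footers, and page numbers
```

After compiling, a call such as pdftotext main.pdf main.txt (main being a placeholder file name) should then produce the text file. Pages that set their own style, such as chapter openings in the standard book and report classes, may still carry a page number.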