[TeX/LaTeX] Extracting the contents of text in a specified environment into a new file

environments, text manipulation

Imagine I have a complete LaTeX file and I want to extract only the text that appears in a specified environment (e.g., within a custom hypothesis environment).

For example:

\begin{document}
...
Lots of stuff that I don't want extracted
\begin{hypothesis}
 The content that I want to extract
\end{hypothesis}
...
Lots more stuff I don't want to extract
...
\begin{hypothesis}
 Some more content that I want to extract
\end{hypothesis}
\end{document}

The question: What's a simple way of taking a complete LaTeX source file, extracting just the text in a specified environment, and saving it to a new text file?

Although I'm not an expert, I've heard a lot of people talk about Perl scripts being good for string manipulation. I also sometimes use regular expressions. Thus, in addition to a specific solution to the above problem, I would be interested to hear about general approaches to related LaTeX text manipulation tasks.

Update: Copying and pasting is not a desired option because the environment occurs over 20 times in a 20,000 word document.

Best Answer

As well as the options provided by Ulrike, you might also be interested in the extract package, which was written for exactly the problem you're looking to solve.

To export all text within the hypothesis environment to a file called filename, add the following code to the preamble:

\usepackage[active, generate=filename, extract-env={hypothesis}]{extract}
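
If you would rather not touch the preamble, the regular-expression route mentioned in the question can be sketched outside LaTeX. The following is a minimal Python sketch, not part of the extract package; the function names extract_environment and extract_to_file and the file names in the usage comment are hypothetical. It assumes the environment is not nested inside itself and does not appear in comments or verbatim material, neither of which a plain regex can detect.

```python
import re
from typing import List


def extract_environment(source: str, env: str = "hypothesis") -> List[str]:
    """Return the body of every \\begin{env}...\\end{env} block in source."""
    name = re.escape(env)  # keeps names such as "theorem*" literal
    pattern = re.compile(
        r"\\begin\{" + name + r"\}(.*?)\\end\{" + name + r"\}",
        re.DOTALL,  # let "." match newlines so multi-line bodies are captured
    )
    # Non-greedy (.*?) stops at the first matching \end, so each block
    # is returned separately rather than as one large span.
    return [body.strip() for body in pattern.findall(source)]


def extract_to_file(in_path: str, out_path: str, env: str = "hypothesis") -> int:
    """Write all bodies of env found in in_path to out_path; return the count."""
    with open(in_path, encoding="utf-8") as f:
        bodies = extract_environment(f.read(), env)
    with open(out_path, "w", encoding="utf-8") as f:
        # Separate the extracted blocks with a blank line.
        f.write("\n\n".join(bodies) + "\n")
    return len(bodies)


# Example call (file names are placeholders):
# extract_to_file("paper.tex", "hypotheses.txt", env="hypothesis")
```

Unlike the extract package, this works on the raw source text, so macros inside the environment are copied unexpanded.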