I am trying to convert a .tex file into a .txt file so that it can be pasted directly into environments that support only MathJax, for example blogs, Mathematics Stack Exchange, stackedit.io, etc.
But I am having a problem with user-defined environments like theorem, definition, etc.
\begin{proof}
Example
\end{proof}
In a LaTeX editor, the PDF would be rendered as
Proof . Example
But converting it to .txt using the command pandoc -o output.txt input.tex
the output is rendered as
Example
The heading is missing. Similarly, other user-defined environments also lose their respective headings.
Is there some way to make Pandoc add the word "Proof" or heading corresponding to an environment at the beginning?
Best Answer
Short answer: no.
Long answer:
The only ways an automated script can know that the output should contain the word "Proof" are:
1) This knowledge is hardcoded in the script. It knows the meaning of some LaTeX commands and environments (the route taken by pandoc)
2) It can run the TeX code and get the output (the route taken by t4ht, for example)
The first approach is not flexible enough, since you can load packages which are not known by the script, and which define commands that will be ignored (in addition, your document can define your own commands too).
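To make the limitation of the first approach concrete, here is a minimal sketch of a preprocessing step that could be run on the .tex source before handing it to pandoc. It is not something pandoc provides: the function name add_headings and the HEADINGS table are hypothetical, and the table is exactly the hardcoded knowledge described above, so any environment not listed in it is left untouched. It also assumes the heading is a fixed word, so theorem numbering is not reproduced.

```python
import re

# Hypothetical table mapping an environment name to the heading LaTeX would
# print for it. This is the hardcoded knowledge: environments that are not
# listed here (for instance ones the document defines itself) are not handled.
HEADINGS = {
    "proof": "Proof.",
    "theorem": "Theorem.",
    "definition": "Definition.",
}

# Matches \begin{name}, plus an optional [..] argument so that the heading is
# inserted after it rather than in front of it.
BEGIN_ENV = re.compile(r"\\begin\{(\w+)\}(?:\[[^\]]*\])?")


def add_headings(tex: str) -> str:
    """Insert a bold heading right after each known \\begin{env}."""

    def repl(match: re.Match) -> str:
        heading = HEADINGS.get(match.group(1))
        if heading is None:
            # Unknown environment: leave the source exactly as it was.
            return match.group(0)
        return f"{match.group(0)}\n\\textbf{{{heading}}} "

    return BEGIN_ENV.sub(repl, tex)


if __name__ == "__main__":
    source = "\\begin{proof}\nExample\n\\end{proof}\n"
    print(add_headings(source))
```

The result would be saved to a new file and converted with the same pandoc command as in the question. Whether the inserted heading survives the conversion depends on how the installed pandoc version treats the environment, so this is only an illustration of the idea and of why it does not scale to arbitrary packages.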
The second approach can be done via pdflatex followed by some "pdf to text" converter, or via latex followed by dvi2tty, or via tex4ht. In any case, it loses the original TeX markup, so it is not appropriate if you want to keep the "code" of the math formulae.
Let's see an example. Consider the following document:
Running it through standard pdflatex you get:
If you run it through pandoc, you get the following .txt:
in which you lose the word "Proof" and the final end-of-proof mark, but which keeps the formula markup.
If you run it through pdflatex and then pdftotext you get:
which keeps the word "Proof", but completely messes up the formula.
If you run it through latex and then dvi2tty, you get:
which is closer to the PDF output, but still loses the formula markup.
If you run it through tex4ht you get an HTML version of the document, which can in turn be processed by pandoc to get the following .txt:
As you can see, none of the solutions is satisfactory.