# [TeX/LaTeX] Extracting the structure of a LaTeX document, including comments

documentation

I like to write a "special" comment with a title at the beginning of each paragraph in my documents, something like this:

\section{Introduction}

% 1.0 Visually tracking balls is an interesting problem.
Text related to why tracking balls is important, blah, blah, blah.
Blah, blah, blah.

% 1.1 Tracking balls is difficult because of these challenges.
Text relating the challenges, blah, blah, blah.
More blah, blah, blah.

% 2.0 Previous work.
Text relating the existing work on tracking balls, blah, blah, blah.
Blah, blah, blah.


This helps me keep the content of my documents well structured.

What I'd like to do is extract the structure of a document: sectioning commands (e.g., \chapter, \section and the like) and "special" comments at the beginning of each paragraph.
I'd like to parse a LaTeX source file (or a group of them, in case the main file includes other source files) and produce a new file that contains only the sectioning commands and the "special" comments, turned into regular text (uncommented) or, better yet, turned into a bulleted list.

So the output generated by running the parser on the code above would be:

\section{Introduction}

\begin{itemize}
\item 1.0 Visually tracking balls is an interesting problem.
\item 1.1 Tracking balls is difficult because of these challenges.
\item 2.0 Previous work.
\end{itemize}


The best solution I have so far is to label the special comments with a character sequence (e.g., starting them with "%$"), then grep for occurrences of "%$" and of the sectioning commands in the source file.
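A minimal sketch of that approach, assuming the marker "%$" sits at the very start of a line and that only \chapter, \section, \subsection and \subsubsection (optionally starred) matter. The file names sample.tex and outline.txt and the sample contents are made up for illustration:

```shell
# Hypothetical sample file: one marked comment and one ordinary comment.
printf '%s\n' \
  '\section{Introduction}' \
  '' \
  '%$ 1.0 Visually tracking balls is an interesting problem.' \
  'Text related to why tracking balls is important.' \
  '% an ordinary comment that should not appear in the outline' \
  > sample.tex

# Keep only lines that start with the "%$" marker or with a sectioning command.
grep -E '^(%\$|\\(chapter|(sub)*section)\*?\{)' sample.tex > outline.txt
cat outline.txt
```

On this sample only the \section line and the "%$" comment survive; the ordinary comment is dropped because it lacks the marker.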

This is easily done with any command-line tools. Here I use grep and sed (in Cygwin bash on Windows), but other systems have similar tools, or you could use Perl.

If zz.tex is your original file, the command line

grep "\(sub\)*section\|^%" zz.tex | sed -e '/^\\s/ a \\\\begin{itemize}' -e 's/^%/\\item /' -e '$ a \\\\end{itemize}'


outputs

\section{Introduction}
\begin{itemize}
\item  1.0 Visually tracking balls is an interesting problem.
\item  1.1 Tracking balls is difficult because of these challenges.
\item  2.0 Previous work.
\end{itemize}
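
Note that the command above appends \begin{itemize} after every sectioning line but emits \end{itemize} only once, after the last line of output, so a file with several sections would end up with unclosed lists. One possible way around that is sketched below. It swaps sed for awk, which is an assumption of this sketch and not part of the command above. The file names zz2.tex and outline.tex and the sample text are hypothetical:

```shell
# Hypothetical two-section input using plain "%" comments.
cat > zz2.tex <<'EOF'
\section{Introduction}

% 1.0 Visually tracking balls is an interesting problem.
Text related to why tracking balls is important.

\section{Related work}

% 2.0 Previous work.
Text relating the existing work on tracking balls.
EOF

awk '
  # Close the currently open list, if there is one.
  function close_list() {
    if (in_list) { print "\\end{itemize}"; in_list = 0 }
  }

  # Sectioning command: close the previous list, then print the heading.
  /^\\(chapter|(sub)*section)\*?\{/ { close_list(); print; next }

  # Comment at the start of a line: open a list if needed, then emit an item.
  /^%/ {
    if (!in_list) { print "\\begin{itemize}"; in_list = 1 }
    text = substr($0, 2)          # drop the leading "%"
    sub(/^[ \t]+/, "", text)      # drop whitespace after it
    print "\\item " text
  }

  # End of input: close the last list.
  END { close_list() }
' zz2.tex > outline.tex

cat outline.tex
```

Each section then gets its own balanced itemize environment, and a section without any comments gets no list at all.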