[TeX/LaTeX] Arbitrary text parsing from a separate file

compiling, macros, parsing, tex-core

I'm still very new to the world of TeX, so please bear with me if this question is unclear or has been asked before. I'm starting to figure my way through TeX and the millions of plugins and extensions for it.

As a side note: had I known about TeX when I was studying for my CompSci degree, it would have made writing my dissertation so much easier – I like it a lot better than other document creation and word processing packages.

I'm building a TeX document. I'd like all of the formatting information to be stored in the main document and all of my text in separate files, but I'd also like some arbitrary parsing to be performed when TeX reads the text entries.

As an example, suppose I'm implementing a simple resume using this as an example (mainly because of its clear separation of sections). Would TeX be able to pull the text for each section from a file, parse it so that it fits the formatting, and display only the first N entries?

In the example file, the author creates a command called \EducationEntry with the following code:

\newcommand{\EducationEntry}[4]{ 
\noindent \textbf{#1} \hfill 
\colorbox{Black}{% 
    \parbox{6em}{% 
    \hfill\color{White}#2}} \par 
\noindent \textit{#3} \par 
\noindent\hangindent=2em\hangafter=0 \small #4 
\normalsize \par}

So, is it possible to have a text file that reads something like:

BA Some course or other | 2502-2504 | University of Mars | Details
MA More courses | 2504-2506 | University of The Moon | Details

such that TeX can read each line, split it on a given character (in this case '|', but it could be anything), and create an \EducationEntry for each line?

I'm thinking of implementing a very similar structure for a user guide that I'm working on, and for a few translation projects too. If this can be done, and it's not too complicated for a newer user like me, then it would make consistent formatting a lot easier. I'm still only in the planning/design stages of the overall look of the document at the moment. However, I have all of the text ready to go.

If it helps, I'm using Texmaker 3.5 on both MS Windows and Ubuntu 12.04 (but I doubt that will make a huge difference), and I'm outputting to PDF via PS.

Best Answer

I agree with the comments saying that preprocessing the data files is better. However, as an exercise in expl3 programming, here's a way:

\begin{filecontents*}{\jobname.ls1}
BA Some course or other | 2502-2504 | University of Mars | Details
MA More courses | 2504-2506 | University of The Moon | Details
\end{filecontents*}

\begin{filecontents*}{\jobname.ls2}
BA Some course or other + 2502-2504 + University of Mars + Details
MA More courses + 2504-2506 + University of The Moon + Details
\end{filecontents*}

\documentclass{article}

\usepackage{xparse,xcolor}
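\ExplSyntaxOn
% User-level command: optional separator (default |) and mandatory file name
\NewDocumentCommand{\readdata}{O{|} m}
 {
  \taylor_readdata:nn { #1 } { #2 }
 }

\seq_new:N \l__taylor_line_seq % fields of the current line
\ior_new:N \g__taylor_read_ior % input stream for the data file

% Open the file, process it line by line, then close it
\cs_new_protected:Npn \taylor_readdata:nn #1 #2
 {
  \ior_open:Nn \g__taylor_read_ior { #2 }
  \ior_map_inline:Nn \g__taylor_read_ior
   { \__taylor_doline:nn { #1 } { ##1 } }
  \ior_close:N \g__taylor_read_ior
 }
% Split one line at the separator and pass the braced fields to \EducationEntry
\cs_new_protected:Npn \__taylor_doline:nn #1 #2
 {
  \seq_set_split:Nnn \l__taylor_line_seq { #1 } { #2 }
  \use:x
   {
    \exp_not:N \EducationEntry
    \seq_map_function:NN \l__taylor_line_seq \__taylor_brace:n
   }
 }
% Wrap a field in braces; \exp_not:n keeps \use:x from expanding its contents
\cs_new:Npn \__taylor_brace:n #1 { { \exp_not:n { #1 } } }
\ExplSyntaxOff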

\newcommand{\EducationEntry}[4]{ 
\noindent \textbf{#1} \hfill 
\colorbox{black}{% 
    \parbox{6em}{% 
    \hfill\color{white}#2}} \par 
\noindent \textit{#3} \par 
\noindent\hangindent=2em\hangafter=0 \small #4 
\normalsize \par}

\begin{document}

\readdata{\jobname.ls1}

\bigskip

\readdata[+]{\jobname.ls2}

\end{document}

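The column separator can be specified as the optional argument to \readdata; it defaults to |. Spaces around each field are removed by \seq_set_split:Nnn, so the data file can be laid out with spaces around the separators.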
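
The question also asks about displaying only the first N entries, which the code above does not cover. Here is a minimal sketch of one way to do it, assuming it is placed in the preamble after the definitions above (it reuses \g__taylor_read_ior and \__taylor_doline:nn). The names \readdatamax and \l__taylor_count_int are hypothetical and only serve this illustration:

```latex
\ExplSyntaxOn
% Hypothetical counter for the number of lines typeset so far
\int_new:N \l__taylor_count_int

% \readdatamax[<separator>]{<file>}{<N>}: like \readdata, but stops after N lines
\NewDocumentCommand{\readdatamax}{O{|} m m}
 {
  \int_zero:N \l__taylor_count_int
  \ior_open:Nn \g__taylor_read_ior { #2 }
  \ior_map_inline:Nn \g__taylor_read_ior
   {
    \int_compare:nNnTF { \l__taylor_count_int } < { #3 }
     {
      % Still below the limit: count the line and typeset it
      \int_incr:N \l__taylor_count_int
      \__taylor_doline:nn { #1 } { ##1 }
     }
     {
      % Limit reached: leave the loop without reading further lines
      \ior_map_break:
     }
   }
  \ior_close:N \g__taylor_read_ior
 }
\ExplSyntaxOff
```

With this in place, \readdatamax{\jobname.ls1}{1} should typeset only the first line of the first data file, and \readdatamax[+]{\jobname.ls2}{1} does the same for the second. The limit is checked before each line is processed, so a limit of 0 typesets nothing.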
