[Tex/LaTex] How to merge data and create multiple documents?, similar to personalized mailings

automationcsvdatabaseexcelpython

I would like to use LATEX combined with EXCEL data files (or CSV if it were neccessary) to create "proof of payment".

Supose a list of people

John McEnroe, NYC (USA), 110.23$, 10$,May……
Mary Rose, London, 567.89$, 50$,May………
…………………………………..

and what I would like to obtain is a paper similar to
enter image description here
where you can view the fields of each record of the database.

The question is that I can´t create this.

Could you recommend me any packages or tips to create multiple of this?

Thank you.

I know that it´s not a concret question. It´s a call of SOS asking for help.

Best Answer

As the mightiest tool in my toolbox is python, I see python problems everywhere. (The "my only tool is a hammer, everything looks like a nail problem".)

The idea is the following:

  • A data file, either excel, csv, mysql-database or whatever you like
  • A LaTeX-template with placeholders, I used @@key, as two @ should never occure in a document, but you can think of your own.
  • A python script that fills the placeholders for each row in the data an calls LaTeX to produce the result.

Apart from the standard library you only need pandas for the data munching part. If you want to learn python in the next time and have a scientific background, which i assume if you are using mathematica, i recommend to install python using the anaconda distribution. It comes bundled with nearly all scientific modules und prebuilt dependencies:

http://continuum.io/downloads#py34

This is my data.csv:

productID,firstname,lastname,date
1,Jules,Winnfield,2015-01-01
2,Vincent,Vega,2015-01-02

This is my really simplistic template.tex:

\documentclass{scrartcl}

\usepackage{fontspec}

\begin{document}

\begin{itemize}
  \item @@productID
  \item @@firstname
  \item @@lastname
\end{itemize}

\end{document}

And this is the python script, please ask if something is not clear:

# codecs provides input/output with specific encoding
import codecs
# os includes operating system operations
import os
# this is needed to call latex from python
from subprocess import call
# pandas for data munching
import pandas

# create folders, no error if it already exists
os.makedirs('tmp', exist_ok=True)
os.makedirs('output', exist_ok=True)

# read in the template:
with codecs.open('template.tex', encoding='utf8') as f:
    template = f.read()

data = pandas.read_csv('data.csv')
# show the first 5 rows in the data to have a quick look
print(data.head())

# these are the keys we want to replace with our data:
keys = [
    'productID',
    'firstname',
    'lastname',
]

# no we loop over each row to create a pdf with the
# data
for index, row in data.iterrows():
    filled = template
    for key in keys:
        # replace our placeholder with the actual data, cast to string first
        filled = filled.replace('@@' + key, str(row[key]))

    # create a hopefully unique filename
    filename = 'filled_{}_{}_{}'.format(
        row.lastname,
        row.firstname,
        row.date,
    )
    # now we write the filled template to the tmp folder
    with codecs.open('tmp/' + filename + '.tex', 'w', encoding='utf8') as f:
        f.write(filled)

    # and call lualatex or any other latex compiler
    # call takes a list of arguments
    call(['lualatex',
          '--interaction=batchmode',
          '--output-directory=tmp',
          'tmp/' + filename + '.tex',
          ])

    # there is a missing newline at the end of the latex call
    print('\n')

    # now move the file to the output folder:
    os.rename('tmp/' + filename + '.pdf', 'output/' + filename + '.pdf')

# now we delete the tmp folder
call(['rm', '-rf', 'tmp'])

pandas also provides read_excel, read_sql_table, read_sql_query and many more: http://pandas.pydata.org/pandas-docs/stable/io.html