[TeX/LaTeX] Automatic document generation based on a database

automation python

I want to automatically create business cards from a database containing my employees' data, i.e. to generate one .pdf file per person by filling in a template document with the information from the database.

I have a database (a .csv file) that looks like this:

Employee ID, Last Name, First name, Telephone number, Additional informations
001, Dylan, Bob, 012345, Some stuff
007, Doe, John, 01452, 
002, Doe, Jane, , A lot of \emph{informations} with \LaTeX commands that are very long \newline and maybe should be displayed on several lines etc. \begin{itemize} \item fus \item ro \item dah! \end{itemize}

My .tex file that defines the look of the PDF is

\documentclass{scrartcl}
\begin{document}
    \includegraphics[width=1cm,height=3cm]{#EmployeeID.jpg}
    {\sffamily #FirstName \textsc{#LastName}}
    \newline
    \section{Phone}
    {\tiny #telephone number}
    \newpage
    #AdditionalInformations
\end{document}

I want the LaTeX file to read the second row of the database (the first one after the header), replace the #Fields with their values, generate a PDF, and then start again with the next row, for as long as necessary.

I looked at the datatool package and I am able to load the database (\DTLloaddb[autokeys]{DB}{my_db.csv}), but I'm drawing a blank on two points:

  1. How to read only the n-th element of the m-th row?
  2. How to generate a PDF and repeat with m:=m+1?

Thanks for any help (fixing these issues or indicating packages I should look at)!

Best Answer

Following Jim's suggestion, I wrote some Python code that does what I wanted.

Even though it is more Python-oriented, I think it is worth posting the answer here. (Since we are TeXers here, and maybe not Pythonistas, there are 'for dummies' explanations; sorry for the length.)


Needed files

You need the following files in your working directory:

  • A Python file named routine.py containing the code given below,
  • A CSV database named database.csv, with the following structure:

    Employee ID, Last Name, First name, Telephone number, Additional informations
    001, Dylan, Bob, 012345, Some stuff
    007, Doe, John, 01452,
    002, Doe, Jane, , "A lot of \emph{informations} with \LaTeX commands, and commas as well. \newline It can be displayed on several lines. \begin{itemize} \item fus \item ro \item dah! \end{itemize}"
    

(Note that there is a header row; the Python routine skips this first line.)

  • One picture per employee, named according to the pattern EmployeeID.extension. (In this example you need three pictures: 001.jpg, 002.png, 007.jpg. Note that the format does not have to be the same for all pictures.)
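
Regarding the database file: the last field of the third row is wrapped in double quotes because it contains commas, and each comma is followed by a space. The following is a minimal sketch, assuming Python 3, of how the csv module splits such rows when skipinitialspace=True is set. The names sample_csv and rows are illustrative only, and io.StringIO stands in for the real database.csv.

```python
import csv
import io

# In-memory stand-in for database.csv (same layout as the sample, shortened)
sample_csv = r'''Employee ID, Last Name, First name, Telephone number, Additional informations
001, Dylan, Bob, 012345, Some stuff
007, Doe, John, 01452,
002, Doe, Jane, , "A lot of \emph{informations}, and commas as well. \newline More text"
'''

# skipinitialspace=True drops the space after each comma, so the opening
# quote of the last field is recognised as a quote character
reader = csv.reader(io.StringIO(sample_csv), delimiter=',', skipinitialspace=True)
header = next(reader)   # skip the header row
rows = list(reader)

for row in rows:
    print(len(row), row)
```

Every row should come out with five fields, the empty ones as empty strings, and the LaTeX commands in the last field are passed through unchanged.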

The Python code

Following Dirk's suggestion, you have two options: either you generate one PDF file per business card, or you generate one PDF file containing all the business cards. Both versions of the Python routine are given below.

Please see the embedded comments for explanations.

  • To create one PDF file per business card

The overall procedure is as follows:

  1. the routine reads a row of the .csv database,
  2. generates a customized LaTeX code,
  3. and compiles this.
  4. Then, it repeats the process (steps 1 to 3) for each row.

###== imported packages ==###
import csv
import subprocess # cf "http://stackoverflow.com/questions/19683123/
                  # compile-latex-from-python" for original example

###== Definition of the LaTeX template (with "blanks") ==###
    # caution: backslashes must be escaped with backslashes
    #   blanks are filled with %(Name)s
    # the names used here (Id, Fn, Ln, Ph, Ot) are two letters long
    # 's' means the variable is a string
LatexContent = '''\\documentclass{scrartcl}
                        \\usepackage{graphicx}
                        \\begin{document}
                            \\includegraphics[width=1cm,height=3cm]{%(Id)s}
                            {\\sffamily %(Fn)s \\textsc{%(Ln)s}}
                                \\newline
                            \\section{Phone}
                            {\\tiny Phone number: %(Ph)s}
                                \\newpage
                            %(Ot)s
                   \\end{document}'''

###== Look at the database ==###
# open the database in Python
# (Python 3: text mode; newline="" is what the csv module expects)
my_db_file = open("database.csv", newline="")

# read the database
my_db = csv.reader(my_db_file, delimiter=',',skipinitialspace=True)

###== TeX files processing and generating ==###
#skip the header of the database
next(my_db)

#then for each row of the database
for row in my_db :
        ## Assign the items of the row to the variables that will fill up the 
        ##    blanks of the LaTeX code
        ID = str(row[0])            #caution, first item of a row = index '0'
        LastName = str(row[1])
        FirstName = str(row[2])
        Phone = str(row[3])
        Other = str(row[4])

            #define the TeX file name
        TexFileName = ID + '.tex'

        ## create a new LaTeX file with the blanks filled
            #create a new file
        TexFile = open(TexFileName,'w')

            #fill the blanks with the values read from the row
        TexFile.write(LatexContent %{"Id" : ID, "Fn" : FirstName, 
        "Ln" : LastName, "Ph" : Phone, "Ot" : Other })

            #close the file
        TexFile.close()

        ## compile the file you've just created with LaTeX
            #subprocess.call waits until pdflatex has finished
        subprocess.call(['pdflatex',TexFileName],shell=False)

        ##repeat for each row

#close the database file
my_db_file.close()
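
As a side note on the template: the doubled backslashes can be avoided with a raw string. The following is a minimal sketch, assuming Python 3; card_template and fill_card are hypothetical names, and the template is shortened. One caveat is that %-formatting treats every percent sign as special, so a literal % (a LaTeX comment) has to be written as %%.

```python
# Raw string (r'''...'''): backslashes are written once, as in a .tex file.
# A literal percent sign must be doubled because of %-formatting.
card_template = r'''\documentclass{scrartcl}
\usepackage{graphicx}
\begin{document}
    %% filled in by Python
    \includegraphics[width=1cm,height=3cm]{%(Id)s}
    {\sffamily %(Fn)s \textsc{%(Ln)s}}
\end{document}
'''

def fill_card(row):
    """Return the LaTeX source for one database row (a list of strings)."""
    # column order in the database: ID, last name, first name, ...
    return card_template % {"Id": row[0], "Fn": row[2], "Ln": row[1]}

print(fill_card(["001", "Dylan", "Bob", "012345", "Some stuff"]))
```

The same idea applies to the full template; only the way the string is written changes, not the generated LaTeX.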

  • To create only one PDF with all the business cards

    The overall procedure is slightly different here:

    1. the routine generates the head of the .tex file,
    2. reads a row of the .csv database,
    3. generates the customized TeX code and appends it to the existing .tex file,
    4. repeats steps 2 and 3 for each row,
    5. appends the bottom of the TeX code to the existing .tex file,
    6. finally compiles the .tex file.

###== imported packages ==###
import csv
import subprocess # cf "http://stackoverflow.com/questions/19683123/
                  # compile-latex-from-python" for original example

###== Definition of the LaTeX template (with "blanks") ==###
    # caution: backslashes must be escaped with backslashes
    #   blanks are filled with %(Name)s
    # the names used here (Id, Fn, Ln, Ph, Ot) are two letters long
    # 's' means the variable is a string

LaTeXpreamble='''\\documentclass{scrartcl}
                        \\usepackage{graphicx}
                        \\begin{document}'''

LaTeXcodePerBusinessCard='''\\includegraphics[width=1cm,height=3cm]{%(Id)s}
                            {\\sffamily %(Fn)s \\textsc{%(Ln)s}}
                                \\newline
                            \\section{Phone}
                            {\\tiny Phone number: %(Ph)s}
                                \\newpage
                            %(Ot)s
                            \\null'''
LaTeXinBetween='''\\newpage'''

LaTeXcolophon='''\\end{document}'''

###== Look at the database ==###
# open the database in Python
# (Python 3: text mode; newline="" is what the csv module expects)
my_db_file = open("database.csv", newline="")

# read the database
my_db = csv.reader(my_db_file, delimiter=',',skipinitialspace=True)

###== TeX files processing and generating ==###
#skip the header of the database
next(my_db)

#create a new textfile
    #define its name
TexFileName = 'businessCards.tex' 

    #create the file
TexFile = open(TexFileName,'w')

    #copy the preamble
TexFile.write(LaTeXpreamble)

    #close the file and re-open it in 'append' mode
TexFile.close()
TexFile = open(TexFileName,'a')

#then for each row of the database
for row in my_db :
        ## Assign the items of the row to the variables that will fill up the 
        ##    blanks of the LaTeX code
        ID = str(row[0])            #caution, first item of a row = index '0'
        LastName = str(row[1])
        FirstName = str(row[2])
        Phone = str(row[3])
        Other = str(row[4])


        ## add a piece of code with the blanks filled

            #fill the blanks with the values read from the row
        LaTeXcodeToAdd = LaTeXcodePerBusinessCard %{"Id" : ID, 
        "Fn" : FirstName, "Ln" : LastName, "Ph" : Phone, "Ot" : Other }

            #append this code to the .tex file
        TexFile.write(LaTeXcodeToAdd)

            #append the 'in-between' code to separate two business cards
        TexFile.write(LaTeXinBetween)

        ##repeat for each row

#append the colophon to finish the .tex file
TexFile.write(LaTeXcolophon)

#close the file
TexFile.close()

## compile the .tex file with pdfLaTeX
    #subprocess.call waits until pdflatex has finished
subprocess.call(['pdflatex',TexFileName],shell=False)

#close the database file
my_db_file.close()
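
Both routines hand the .tex file to pdflatex through the subprocess module. The following is a hedged variant of that step, assuming Python 3.5 or later. It waits for the compiler, runs it non-interactively so that a LaTeX error cannot leave it waiting for keyboard input, and reports whether the run succeeded. The helper names pdflatex_command and compile_tex are made up for this sketch.

```python
import shutil
import subprocess

def pdflatex_command(tex_file_name):
    """Build the argument list for a non-interactive pdflatex run."""
    return ["pdflatex", "-interaction=nonstopmode", "-halt-on-error",
            tex_file_name]

def compile_tex(tex_file_name):
    """Run pdflatex, wait for it, and return True if it exited with code 0."""
    if shutil.which("pdflatex") is None:
        raise RuntimeError("pdflatex was not found on the PATH")
    result = subprocess.run(pdflatex_command(tex_file_name))
    return result.returncode == 0

# Example call (requires a TeX installation):
#   ok = compile_tex("businessCards.tex")
```

In the one-PDF-per-card routine, a False result could be used to print the ID of the employee whose card failed to compile.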

How to run it?

  1. Open your terminal
  2. Browse to the directory containing the aforementioned files
  3. Execute the command python routine.py

That's all!