Merge comma separated lists from Datatool

datatool expl3 xparse

I have been trying to solve this for some time but can't find the solution. I have a document that loads data from a CSV file using datatool, and I am trying to combine comma-separated information from two cells of that file. The file itself uses ";" as its separator, while the information inside a single cell is separated by ",". The MWE below demonstrates what I am trying to do. I have two problems. The first can be seen in the example using a,b,c and 1,2,3: I get a "," between the lines, where I would like only a " " separating the paired items and only \newline separating the lines.

The other problem is that when I pass the comma-separated lists loaded by datatool to my macro, each cell is treated as a single item.

The result I want is:
1.2.3 Some data
2.3.4 More data
3.4.5 Even more data

The output from the two examples right now is:

a1
,b2
,c3

1.2.3 Some data
2.3.4,3.4.5 More data,Even more data

This is going to be part of a much bigger document (about 650 lines and about 30 packages), so if you need a list of the packages I use in order to avoid compatibility problems, I can provide it. I process the document with XeLaTeX.

I have looked at these solutions but still can't get it right:
Pairing items from two comma-separated lists into a single list

How to iterate over a comma separated list passed through the datatool interface

\begin{filecontents*}{data.csv}
A;B
1.2.3;Some data
2.3.4,3.4.5;More data,Even more data
\end{filecontents*}

\documentclass{article}

\usepackage{expl3}
\usepackage{xparse}
\usepackage{datatool}

\DTLsetseparator{;}%

\ExplSyntaxOn

\clist_new:N \l_doc_tmpa_clist
\clist_new:N \l_doc_tmpb_clist
\seq_new:N \l_doc_tmpa_seq

\msg_new:nnn {doc} {difflen} {two~comma~separated~lists~have~different~length}


\cs_set:Npn \doc_pair_items:nnn #1#2#3 {
    \clist_set:Nn \l_doc_tmpa_clist {#2}
    \clist_set:Nn \l_doc_tmpb_clist {#3}
    \seq_clear:N \l_doc_tmpa_seq
    
    \int_compare:nNnF {\clist_count:N \l_doc_tmpa_clist} = {\clist_count:N \l_doc_tmpb_clist} {
        \msg_error:nn {doc} {difflen}
    }
    
    \int_step_inline:nn {\clist_count:N \l_doc_tmpa_clist} {
        \seq_put_right:Nn \l_doc_tmpa_seq {
            \clist_item:Nn \l_doc_tmpa_clist {##1}
            #1
            \clist_item:Nn \l_doc_tmpb_clist {##1}\newline
        }
    }
    
    \seq_use:Nn \l_doc_tmpa_seq {,~}
}

\newcommand{\pairitems}[3][=]{
    \doc_pair_items:nnn {#1} {#2} {#3}
}

\ExplSyntaxOff


\begin{document}

\pairitems[ ]{a,b,c}{1,2,3}

\DTLloadrawdb{data}{data.csv}

\DTLforeach{data}
{\A=A,
\B=B}{
\pairitems[ ]{\A}{\B}
 }

\end{document}

Best Answer

Does this do what you need?

\begin{filecontents*}{data.csv}
A;B
1.2.3;Some data
2.3.4,3.4.5;More data,Even more data
\end{filecontents*}

\documentclass[a4paper]{article}

% xparse/expl3 are included in more recent LaTeX2e-kernels
%\usepackage{expl3}
%\usepackage{xparse}
\usepackage{datatool}

\ExplSyntaxOn
\msg_new:nnn {MyStuff} {difflen} {two~comma~separated~lists~have~different~length}
\int_new:N \g_MyStuff_tempcntA
\int_new:N \g_MyStuff_tempcntB
\clist_new:N \g_MyStuff_clistA
\clist_new:N \g_MyStuff_clistB
\NewDocumentCommand\MergeDatabaseColumns{mmm}{
  \DTLforeach{#1}{\A=A,\B=B}{
     \int_gset:Nn \g_MyStuff_tempcntA {1}
     \int_gset:Nn \g_MyStuff_tempcntB {0}
     \clist_gset:No{\g_MyStuff_clistA}{\A}
     \clist_gset:No{\g_MyStuff_clistB}{\B}
     \int_compare:nNnTF {\clist_count:N \g_MyStuff_clistA} = {\clist_count:N \g_MyStuff_clistB} {
        \int_gset:Nn \g_MyStuff_tempcntB {\clist_count:N \g_MyStuff_clistA}
        \MyStuff_MergeListsloop:nn {#2}{#3}
     }{
        \msg_error:nn {MyStuff} {difflen}
     }
  }
}
\cs_new:Nn \MyStuff_MergeListsloop:nn {
  % #1 space/column-separator 
  % #2 \newline/row-separator
  \int_compare:nNnF {\g_MyStuff_tempcntB} < {\g_MyStuff_tempcntA} {
     \clist_item:Nn \g_MyStuff_clistA {\g_MyStuff_tempcntA} #1
     \clist_item:Nn \g_MyStuff_clistB {\g_MyStuff_tempcntA} 
     % Add the row-separator if not in last row of database.
     % In last row of database add row-separator only if not at the last items of comma lists.
     \DTLiflastrow{\int_compare:nNnF {\g_MyStuff_tempcntB} = {\g_MyStuff_tempcntA}{#2}}{#2}
     \int_gincr:N \g_MyStuff_tempcntA
     \MyStuff_MergeListsloop:nn{#1}{#2}
  }
}
\ExplSyntaxOff

\DTLsetseparator{;}%
\DTLloaddb{data}{data.csv}

\begin{document}

\noindent \MergeDatabaseColumns{data}{ }{\newline}

\end{document}
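
A note on the two symptoms in the question, as far as I can tell. The stray commas come from `\seq_use:Nn \l_doc_tmpa_seq {,~}`, which puts ", " between the stored lines. The "one item" problem comes from `\clist_set:Nn`, which stores the token `\A` itself rather than the cell's contents; the code above uses the `o` variant so that the macro is expanded once before the list is split. The following minimal sketch only illustrates that difference. `\CountItems` and `\l_demo_clist` are hypothetical names made up for this example, and `\ListA` stands in for a macro assigned by `\DTLforeach`.

```latex
\documentclass{article}
% Assumes a LaTeX kernel from 2020-10 or later;
% on older kernels load xparse (which loads expl3).
\ExplSyntaxOn
\clist_new:N \l_demo_clist % hypothetical scratch variable
% #1: a macro holding a comma-separated list
\NewDocumentCommand \CountItems { m }
  {
    % n-type: the macro token itself is stored, so there is one item
    \clist_set:Nn \l_demo_clist {#1}
    n:~\clist_count:N \l_demo_clist ;~
    % o-type: the macro is expanded once before the list is split
    \clist_set:No \l_demo_clist {#1}
    o:~\clist_count:N \l_demo_clist
  }
\ExplSyntaxOff
\begin{document}
\def\ListA{2.3.4,3.4.5}% stand-in for a datatool cell
\CountItems{\ListA}% typesets "n: 1; o: 2"
\end{document}
```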



Another example, providing two slightly more generic routines:

\MergeDatabaseCellCommaLists{<database>}
                            {<name of database-field holding comma-list 1>}
                            {<name of database-field holding comma-list 2>}
                            {<Macro which processes arguments><Macro's arguments except last but one and last argument>, 
                              the last but one argument being <item of comma-list 1>,
                              the last argument being <item of comma-list 2>,
                              In case the last list elements are reached, the switch 
                                 \ifLastListItem
                              is true.
                              In case the database's last row is reached, the switch 
                                 \ifLastRow
                              is true.
                              In case the database's last row's last list
                              elements are reached, the switch 
                                 \ifLastRowsLastListItem
                              is true.
                              The switches can be used in the definition and in the arguments
                              of <Macro which processes arguments>
                            }

and

\MergeCommaListsFromMacros{<Macro holding comma-list 1>; could come from a \DTLforeach's assignment-list}
                          {<Macro holding comma-list 2>; could come from a \DTLforeach's assignment-list}
                          {<Macro which processes arguments><Macro's arguments except last but one and last argument>, 
                            the last but one argument being <item of comma-list 1>,
                            the last argument being <item of comma-list 2>,
                            In case the last list elements are reached, the switch 
                               \ifLastListItem
                            is true.
                            In case of being inside a \DTLforeach-loop for iterating
                            rows of a database and the database's last row being
                            reached, the switch
                               \ifLastRow
                            is true.
                            In case of being inside a \DTLforeach-loop for iterating
                            rows of a database and the database's last row's last list
                            elements being reached, the switch 
                               \ifLastRowsLastListItem
                            is true.
                            The switches can be used in the definition and in the arguments
                            of <Macro which processes arguments>
                          }
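
The calling convention described above amounts to handing over a macro together with its leading arguments; the routine then appends the two list items as the final two arguments. A minimal sketch of that idea in plain LaTeX, independent of datatool: `\ApplyToPair` and `\PrintPair` are hypothetical stand-ins for this illustration and are not the routines defined in this answer.

```latex
\documentclass{article}
% Hypothetical processing macro:
% #1 element-separator, #2 row-separator,
% #3 item from list 1, #4 item from list 2
\newcommand\PrintPair[4]{#3#1#4#2}
% Hypothetical stand-in for the merging routine: appends the two
% items (#2, #3) as the last two arguments of whatever is in #1.
\newcommand\ApplyToPair[3]{#1{#2}{#3}}
\begin{document}
\noindent
% Becomes \PrintPair{ }{\newline}{1.2.3}{Some data}
\ApplyToPair{\PrintPair{ }{\newline}}{1.2.3}{Some data}%
% Becomes \PrintPair{ }{}{2.3.4}{More data}
\ApplyToPair{\PrintPair{ }{}}{2.3.4}{More data}
\end{document}
```

Because the items always arrive last, any arguments given up front (here the two separators) can be chosen freely at the call site.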

  

\begin{filecontents*}{data.csv}
A;B
1.2.3;Some data
2.3.4,3.4.5;More data,Even more data
\end{filecontents*}


\documentclass[a4paper]{article}

% xparse/expl3 are included in more recent kernels
%\usepackage{expl3}
%\usepackage{xparse}
\usepackage{datatool}

\makeatletter
\@ifdefinable\savedDTLiflastrow{\let\savedDTLiflastrow\DTLiflastrow}
\newcommand\AtIfInDTLLoop{%
  \ifx\savedDTLiflastrow\DTLiflastrow
  \expandafter\@secondoftwo\else\expandafter\@firstoftwo\fi
}
\makeatother

\ExplSyntaxOn
\newif\ifLastRowsLastListItem \global\LastRowsLastListItemfalse
\newif\ifLastListItem \global\LastListItemfalse
\newif\ifLastRow \global\LastRowfalse
\msg_new:nnn {MyStuff} {difflen} {two~comma~separated~lists~have~different~length}
\int_new:N \_MyStuff_tempcntA
\int_new:N \_MyStuff_tempcntB
\clist_new:N \_MyStuff_clistA
\clist_new:N \_MyStuff_clistB
\cs_new:Nn \_MyStuff_pushbehind:nnn {#3#1{#2}}
\cs_generate_variant:Nn \_MyStuff_pushbehind:nnn {nVn}
\cs_new:Nn \_MyStuff_passtwoargs:nnn {#1{#2}{#3}}
\cs_generate_variant:Nn \_MyStuff_passtwoargs:nnn {nxx}

%\MergeDatabaseCellCommaLists{<database>}
%                            {<name of database-field holding comma-list 1>}
%                            {<name of database-field holding comma-list 2>}
%                            {<Macro which processes arguments><Macro's arguments except last but one and last argument>, 
%                              the last but one argument being <item of comma-list 1>,
%                              the last argument being <item of comma-list 2>,
%                              In case the last list elements are reached, the switch 
%                                 \ifLastListItem
%                              is true.
%                              In case the database's last row is reached, the switch 
%                                 \ifLastRow
%                              is true.
%                              In case the database's last row's last list
%                              elements are reached, the switch 
%                                 \ifLastRowsLastListItem
%                              is true.
%                              The switches can be used in the definition and in the arguments
%                              of <Macro which processes arguments>
%                            }
\NewDocumentCommand\MergeDatabaseCellCommaLists{mmmm}{
  \DTLforeach{#1}{\A=#2,\B=#3}{\MergeCommaListsFromMacros{\A}{\B}{#4}}
}
%\MergeCommaListsFromMacros{<Macro holding comma-list 1>; could come from a \DTLforeach's assignment-list}
%                          {<Macro holding comma-list 2>; could come from a \DTLforeach's assignment-list}
%                          {<Macro which processes arguments><Macro's arguments except last but one and last argument>, 
%                            the last but one argument being <item of comma-list 1>,
%                            the last argument being <item of comma-list 2>,
%                            In case the last list elements are reached, the switch 
%                               \ifLastListItem
%                            is true.
%                            In case of being inside a \DTLforeach-loop for iterating
%                            rows of a database and the database's last row being
%                            reached, the switch
%                               \ifLastRow
%                            is true.
%                            In case of being inside a \DTLforeach-loop for iterating
%                            rows of a database and the database's last row's last list
%                            elements being reached, the switch 
%                               \ifLastRowsLastListItem
%                            is true.
%                            The switches can be used in the definition and in the arguments
%                            of <Macro which processes arguments>
%                          }
\NewDocumentCommand\MergeCommaListsFromMacros{mmm}{
  %
  % \_MyStuff_pushbehind:nVn  and \use_ii_i:nn are used to reset scratch-comma-lists
  % \_MyStuff_clistA, \_MyStuff_clistB and \if-switches \ifLastListItem,
  % \ifLastRow, \ifLastRowsLastListItem to the values they had before starting
  % the loop -- this way \MergeCommaListsFromMacros can be nested, i.e., the
  % <Macro which processes arguments> can do another call to
  % \MergeCommaListsFromMacros as long as you don't mess around with the
  % \globaldefs-parameter.
  %
  \_MyStuff_pushbehind:nVn {\clist_set:Nn{\_MyStuff_clistA}}{\_MyStuff_clistA}{
    \_MyStuff_pushbehind:nVn {\clist_set:Nn{\_MyStuff_clistB}}{\_MyStuff_clistB}{
      \clist_set:No{\_MyStuff_clistA}{#1}
      \clist_set:No{\_MyStuff_clistB}{#2}
      \int_compare:nNnTF {\clist_count:N \_MyStuff_clistA} = {\clist_count:N \_MyStuff_clistB} {
         \legacy_if:nTF{LastListItem}{
           \use_ii_i:nn { \LastListItemtrue }% \legacy_if_set_true:n {LastListItem}
         }{
           \use_ii_i:nn {\LastListItemfalse }% \legacy_if_set_false:n {LastListItem}
         }{
           \legacy_if:nTF{LastRow}{
             \use_ii_i:nn { \LastRowtrue }% \legacy_if_set_true:n {LastRow}
           }{
             \use_ii_i:nn { \LastRowfalse }% \legacy_if_set_false:n {LastRow}
           }{
             \legacy_if:nTF{LastRowsLastListItem}{
               \use_ii_i:nn { \LastRowsLastListItemtrue }% \legacy_if_set_true:n {LastRowsLastListItem}
             }{
               \use_ii_i:nn { \LastRowsLastListItemfalse }% \legacy_if_set_false:n {LastRowsLastListItem}
             }{
               \_MyStuff_pushbehind:nVn {\int_set:Nn \_MyStuff_tempcntA}{\_MyStuff_tempcntA}{
                 \_MyStuff_pushbehind:nVn {\int_set:Nn \_MyStuff_tempcntB}{\_MyStuff_tempcntB}{
                   % \legacy_if_set_false:n {LastListItem}
                   \LastListItemfalse
                   %\legacy_if_set_false:n {LastRow}
                   \LastRowfalse
                   % \legacy_if_set_false:n {LastRowsLastListItem}
                   \LastRowsLastListItemfalse
                   \int_set:Nn \_MyStuff_tempcntA {1}
                   \int_set:Nn \_MyStuff_tempcntB {\clist_count:N \_MyStuff_clistA}
                   \_MyStuff_MergeListsloop:N {#3}
                 }
               }
             }
           }
         }
      }{
         \msg_error:nn {MyStuff} {difflen}
      }
    }
  }
}
\cs_new:Nn \_MyStuff_MergeListsloop:N {
  \int_compare:nNnF {\_MyStuff_tempcntB} < {\_MyStuff_tempcntA} {
     \AtIfInDTLLoop{
       \DTLiflastrow{
         %\legacy_if_set_true:n {LastRow}
         \LastRowtrue
       }{}
     }{}
     \int_compare:nNnT {\_MyStuff_tempcntB} = {\_MyStuff_tempcntA}
                       {
                          %\legacy_if_set_true:n {LastListItem}
                          \LastListItemtrue
                          \legacy_if:nT {LastRow}
                                        {
                                            %\legacy_if_set_true:n {LastRowsLastListItem}
                                            \LastRowsLastListItemtrue
                                        }
                       }
     \_MyStuff_passtwoargs:nxx {#1}
       {\clist_item:Nn \_MyStuff_clistA {\_MyStuff_tempcntA}}
       {\clist_item:Nn \_MyStuff_clistB {\_MyStuff_tempcntA}}
     \int_incr:N \_MyStuff_tempcntA
     \_MyStuff_MergeListsloop:N {#1}
  }
}
\ExplSyntaxOff

\DTLsetseparator{;}%
\DTLloaddb{data}{data.csv}

\NewDocumentCommand\PrintElementsMerged{mmmm}{%
 % #1 element-separator
 % #2 row-separator
 % #3 element from first list
 % #4 element from second list
#3#1#4#2}%

\begin{document}

\noindent Lists come from database's data-fields A and B:

\noindent \MergeDatabaseCellCommaLists{data}{A}{B}{%
  \PrintElementsMerged{ }{\ifLastRowsLastListItem\else\newline\fi}%
}

\bigskip

% Inside a \DTLforeach-Loop \ListA / \ListB could come from a \DTLforeach's assignment-list-argument.
% Instead of using \PrintElementsMerged you can define and use whatsoever macro that
% processes two arguments, one of them being an element of the first list, the other being an element
% of the second list:

\noindent Lists come from macros \verb|\ListA| and \verb|\ListB|---the macros in turn could be due to
the assignment-list-argument of a \verb|\DTLforeach|-loop, but in this example are defined ``by hand'':

\def\ListA{1.2.3,2.3.4,3.4.5}
\def\ListB{Some data,More data,Even more data}

\noindent \MergeCommaListsFromMacros{\ListA}{\ListB}{%
  \PrintElementsMerged{ }{\ifLastListItem\else\newline\fi}%
}

\end{document}
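
For comparison, if only the pairing step is needed, a shorter sketch is possible with expl3's `\seq_mapthread_function:NNN`, which walks two sequences in parallel. This is a different technique from the loop above, not a drop-in replacement: it performs no length check (the mapping simply stops at the end of the shorter list) and it provides no last-row switches. All `pairdemo` names and `\PairLists` are hypothetical.

```latex
\documentclass{article}
% Assumes a LaTeX kernel from 2020-10 or later;
% on older kernels load xparse (which loads expl3).
\ExplSyntaxOn
% All names below are hypothetical and not part of the code above.
\seq_new:N \l_pairdemo_a_seq
\seq_new:N \l_pairdemo_b_seq
\seq_new:N \l_pairdemo_out_seq
\tl_new:N  \l_pairdemo_sep_tl
% Store "<item A><separator><item B>" as one output line.
\cs_new_protected:Npn \pairdemo_add:nn #1#2
  { \seq_put_right:Nn \l_pairdemo_out_seq { #1 \l_pairdemo_sep_tl #2 } }
% #1 separator within a line; #2, #3 macros holding the comma lists
\NewDocumentCommand \PairLists { m m m }
  {
    \tl_set:Nn \l_pairdemo_sep_tl {#1}
    % Expand each macro once, then split its contents at the commas.
    \exp_args:NNo \seq_set_from_clist:Nn \l_pairdemo_a_seq {#2}
    \exp_args:NNo \seq_set_from_clist:Nn \l_pairdemo_b_seq {#3}
    \seq_clear:N \l_pairdemo_out_seq
    % Call \pairdemo_add:nn once per pair of corresponding items.
    \seq_mapthread_function:NNN
      \l_pairdemo_a_seq \l_pairdemo_b_seq \pairdemo_add:nn
    % \newline goes between the lines only, not after the last one.
    \seq_use:Nn \l_pairdemo_out_seq { \newline }
  }
\ExplSyntaxOff
\begin{document}
\def\ListA{1.2.3,2.3.4,3.4.5}
\def\ListB{Some data,More data,Even more data}
\noindent \PairLists{ }{\ListA}{\ListB}
\end{document}
```

Inside a `\DTLforeach` loop, `\ListA` and `\ListB` would be replaced by the macros from the assignment list.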
