[Tex/LaTex] BibLaTeX: Citation Sorting with Split Bibliographies and Filters

biblatex subdividing


This question is related to earlier questions, but with an extended scope.

Additionally, I filed an issue on the BibLaTeX GitHub repository, from where my question was redirected here.


As in the questions mentioned above, I have a document that uses split bibliographies with BibLaTeX and Biber. The individual bibliography sections are sorted by name, year, title (nyt), and numeric labels are used together with the numeric-comp citation style. I also want multiple citations to be sorted in ascending numerical order and compressed, e.g. [1, 3, 8--10] instead of [3, 9, 1, 10, 8]. From the solutions to the questions above, I have learned that this can be achieved by presorting the bibliography entries with Biber using the \DeclareSourcemap directive.

However, this means that the split criteria have to be defined twice: once in the \printbibliography options and again in the \DeclareSourcemap map definitions. This may be manageable when single keywords are used to split the bibliographies.

In contrast to the aforementioned questions, I separate my bibliography sections with complex filters defined via the \defbibfilter directive. I would therefore have to translate these filters into map directives for \DeclareSourcemap as well. This is not impossible, but it takes considerable effort and is far from trivial.

Initial Situation

Please consider the following MWE (the example bibliography file can be found at the very end of this post):

\documentclass{article}
\usepackage[
    defernumbers=true,          % continuous numbering across bibliography sections
    sortcites=true,             % sort citations if multiple entry keys are passed
                                %   to a citation command, e.g. \cite{a,b,c}
    citestyle=numeric-comp      % compact variant of numeric, which prints a list
                                %   of more than two consecutive numbers as a range
]{biblatex}
\addbibresource{example.bib}

% to show the corresponding entry keys after the individual
% entries within the bibliography
\renewbibmacro{finentry}{\finentry\space\textbf{(\thefield{entrykey})}}

% eliminate paragraph indentation
\setlength{\parindent}{0em}

\begin{document}

    \textbf{Examples:}
    \begin{itemize}
        \item Multiple citations \cite{c,d,f,a,i,h} not sorted numerically ascending;
              should be \verb|[2--4, 8--10]|.
        \item Multiple citations \cite{d,f,a,i,h} not sorted numerically ascending;
              should be \verb|[2--4, 8, 10]|.
    \end{itemize}

    \section*{References}
    \nocite{*}

    \textbf{References with Keyword A}
    \printbibliography[heading=none, keyword={A}]

    \textbf{References with Keyword B}
    \printbibliography[heading=none, keyword={B}]

\end{document}

Compiled output of MWE #1

As you can see, the citation labels are not sorted.

Solution with `\DeclareSourcemap`

Presorting the bibliography entries with Biber can be achieved by using a \DeclareSourcemap directive, as explained in the answers to the questions referenced above. This presorting leads to the desired numerical ascending sorting of the citation labels in multiple citations as well, as seen here:

\documentclass{article}
\usepackage[
    defernumbers=true,          % continuous numbering across bibliography sections
    sortcites=true,             % sort citations if multiple entry keys are passed
                                %   to a citation command, e.g. \cite{a,b,c}
    citestyle=numeric-comp      % compact variant of numeric, which prints a list
                                %   of more than two consecutive numbers as a range
]{biblatex}
\addbibresource{example.bib}

% to show the corresponding entry keys after the individual
% entries within the bibliography
\renewbibmacro{finentry}{\finentry\space\textbf{(\thefield{entrykey})}}

% eliminate paragraph indentation
\setlength{\parindent}{0em}

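% presorting of the bibliography entries with Biber
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    % entries whose keywords field matches "A" get presort=A;
    % "final" ends this map for entries that do not match
    \map{
      \step[fieldsource=keywords, match={A}, final]
      \step[fieldset=presort, fieldvalue = {A}]
    }
    % all remaining entries get presort=B; a presort value set above
    % is kept because overwrite is off by default
    \map{
      \step[fieldset=presort, fieldvalue = {B}]
    }
  }
}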

\begin{document}

    \textbf{Examples:}
    \begin{itemize}
        \item Multiple citations \cite{c,d,f,a,i,h} sorted numerically ascending and
          compressed; expectation \verb|[2--4, 8--10]| met.
        \item Multiple citations \cite{d,f,a,i,h} sorted numerically ascending and
          compressed; expectation \verb|[2--4, 8, 10]| met.
    \end{itemize}

    \section*{References}
    \nocite{*}

    \textbf{References with Keyword A}
    \printbibliography[heading=none, keyword={A}]

    \textbf{References with Keyword B}
    \printbibliography[heading=none, keyword={B}]

\end{document}

Compiled output of MWE #2

Situation with Bibliography Filters

As demonstrated in the previous example, \DeclareSourcemap can serve as a workaround. However, it means implementing the filter criterion a second time. The situation gets worse with complex bibliography filters defined via \defbibfilter. The filters in the following MWE are kept simple on purpose. I know that the same result could be achieved with the standard options of \printbibliography; the filters are meant for illustration only.

\documentclass{article}
\usepackage[
    defernumbers=true,          % continuous numbering across bibliography sections
    sortcites=true,             % sort citations if multiple entry keys are passed
                                %   to a citation command, e.g. \cite{a,b,c}
    citestyle=numeric-comp      % compact variant of numeric, which prints a list
                                %   of more than two consecutive numbers as a range
]{biblatex}
\addbibresource{example.bib}

% to show the corresponding entry keys after the individual
% entries within the bibliography
\renewbibmacro{finentry}{\finentry\space\textbf{(\thefield{entrykey})}}

% eliminate paragraph indentation
\setlength{\parindent}{0em}

% definition of filters for splitting bibliographies
\defbibfilter{only-A}{ keyword={A} and not ( keyword={B} or keyword={C} ) }
\defbibfilter{A-and-C}{ keyword={A} and keyword={C} }

\begin{document}

    \textbf{Examples:}
    \begin{itemize}
        \item Multiple citations \cite{c,d,f,a,i,h} not sorted numerically ascending;
              should be \verb|[2--4, 8--10]|.
        \item Multiple citations \cite{d,f,a,i,h} not sorted numerically ascending;
              should be \verb|[2--4, 8, 10]|.
    \end{itemize}

    \section*{References}
    \nocite{*}

    \textbf{References only with Keyword A}
    \printbibliography[heading=none, filter={only-A}]

    \textbf{References with Keyword B}
    \printbibliography[heading=none, keyword={B}]

    \textbf{References with Keyword A and C}
    \printbibliography[heading=none, filter={A-and-C}]

\end{document}

Compiled output of MWE #3

Again, to have the citation labels sorted numerically, one would have to implement the bibliography filters a second time within the \DeclareSourcemap directive. This doubles the effort and is a potential source of errors and inconsistencies. For complex filters, it is also non-trivial. My question is therefore: Is there an alternative way to have the citation labels sorted in ascending numerical order in this case, without re-implementing the bibliography filters within a \DeclareSourcemap directive?
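For illustration, a hand-written translation of the two filters from the MWE above might look roughly like the following. This is an untested sketch, not a recommended solution. The presort values 1, 2 and 3 are arbitrary assumptions, chosen only so that they sort in the order in which the three bibliographies are printed. The maps rely on overwrite being off by default, so the first map that sets presort wins. Note that match={A} is a regular-expression match, so with real keywords (rather than single letters) the patterns would need to be anchored.

```latex
% Sketch only: mirrors the filters only-A, keyword=B and A-and-C by hand.
% The presort values 1/2/3 are assumptions chosen to follow the print order.
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    % third bibliography: entries that have both keyword A and keyword C
    \map{
      \step[fieldsource=keywords, match={A}, final]
      \step[fieldsource=keywords, match={C}, final]
      \step[fieldset=presort, fieldvalue={3}]
    }
    % second bibliography: entries with keyword B
    \map{
      \step[fieldsource=keywords, match={B}, final]
      \step[fieldset=presort, fieldvalue={2}]
    }
    % first bibliography: remaining entries with keyword A
    % (presort already set by the maps above is not overwritten)
    \map{
      \step[fieldsource=keywords, match={A}, final]
      \step[fieldset=presort, fieldvalue={1}]
    }
  }
}
```

Even for these two simple filters, the logic "A and not (B or C)" is expressed only indirectly through the order of the maps, which is exactly the kind of duplication and error source I would like to avoid.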


Example BibTeX Database

@INPROCEEDINGS{a,
    author = {Kishore Papineni and Salim Roukos and Todd Ward and Wei-Jing Zhu},
    title = {BLEU: A method for automatic evaluation of machine translation},
    booktitle = {In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics},
    year = {2002},
    pages = {311--318},
    keywords = {A, C}
}

@INPROCEEDINGS{b,
    author = {D D Clark},
    title = {The design philosophy of the DARPA internet protocols},
    booktitle = {in ACM SIGCOMM},
    year = {1988},
    pages = {106--114},
    keywords = {B}
}

@INPROCEEDINGS{c,
    author = {Muthu Muthukrishnan},
    title = {Data Streams: Algorithms and Applications},
    booktitle = {of Foundations and Trends in Theoretical Computer Science. Now Publishers Inc},
    year = {2005},
    keywords = {B}
}

@ARTICLE{d,
    author = {R Koenker and G Basset},
    title = {Regression Quantiles},
    journal = {Econometrica},
    year = {1978},
    number = {46},
    pages = {33--50},
    keywords = {A}
}

@MISC{e,
    author = {Michael Wilkinson Institute and Michael H. F. Wilkinson},
    title = {FIRST PUBLISHED IN ANNALS OF IMPROBABLE RESEARCH, VOL. 8, NO. 4, PP. 6-7 1 Zero-Tolerance Math: A Defense of "No Math"},
    year = {},
    keywords = {B}
}

@ARTICLE{f,
    author = {R M Neal},
    title = {Markov chain sampling methods for dirichlet process mixture models},
    journal = {Journal of Computational and Graphical Statistics},
    year = {2000},
    number = {9},
    pages = {249--265},
    keywords = {A, C}
}

@ARTICLE{g,
    author = {K P Birman},
    title = {The process group approach to reliable distributed computing},
    journal = {Communications of the ACM},
    year = {1993},
    number = {36},
    pages = {37--53},
    keywords = {A}
}

@ARTICLE{h,
    author = {Joxan Jaffar and Michael J Maher},
    title = {Constraint logic programming: A survey},
    journal = {Journal of Logic Programming},
    year = {1994},
    number = {19},
    pages = {581},
    keywords = {B}
}

@INPROCEEDINGS{i,
    author = {M J Wainwright and M I Jordan},
    title = {Graphical models, exponential families, and variational inference},
    booktitle = {Foundations and Trends\textregistered{} in Machine Learning},
    year = {2008},
    keywords = {B}
}

@MISC{j,
    author = {K Train},
    title = {Discrete choice methods with simulation},
    year = {2003},
    keywords = {A}
}

Best Answer

For your example, the desired result can be obtained without a source map (make sure to delete old aux files before trying, and compile at least twice after running Biber):

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[
    defernumbers=true,          % continuous numbering across bibliography sections
    sortcites=true,             % sort citations if multiple entry keys are passed
                                %   to a citation command, e.g. \cite{a,b,c}
    citestyle=numeric-comp      % compact variant of numeric, which prints a list
                                %   of more than two consecutive numbers as a range
]{biblatex}
\addbibresource{example.bib}

% to show the corresponding entry keys after the individual
% entries within the bibliography
\renewbibmacro{finentry}{\finentry\space\textbf{(\thefield{entrykey})}}

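\makeatletter
% While each bibliography item is typeset, write a line to the main .aux file
% that appends the item's entry key to the list \mynewsortbiblist.
\AtEveryBibitem{\immediate\write\@mainaux{\string\listxadd\string\mynewsortbiblist{\string\detokenize{\thefield{entrykey}}}}}
% On the next run, reading the .aux file rebuilds \mynewsortbiblist in printing
% order. If the list exists, it replaces the internal biblatex macro
% blx@slist@centry@<refsection>@<refcontext>, so that sortcites follows the
% printing order (internal macro; it may change between biblatex versions).
\AtBeginDocument{\ifundef\mynewsortbiblist{}{\cslet{blx@slist@centry@\the\c@refsection @\blx@refcontext@context}\mynewsortbiblist}}
\makeatother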

\begin{document}
    \textbf{Examples:}
    \begin{itemize}
        \item Multiple citations \cite{c,d,f,a,i,h}  not sorted numerically ascending;
              should be \verb|[2--4, 8--10]|.
        \item Multiple citations \cite{d,f,a,i,h} not sorted numerically ascending;
              should be \verb|[2--4, 8, 10]|.
    \end{itemize}

    \section*{References}
    \nocite{*}

    \textbf{References with Keyword A}
    \printbibliography[heading=none, keyword={A}]

    \textbf{References with Keyword B}
    \printbibliography[heading=none, keyword={B}]


\end{document}

Attention!

There are some open questions:

  • It currently works only for a single refsection and refcontext (I don't actually know what the result should be with more than one).
  • I initially had not tried what happens if a bib entry appears in more than one bibliography. I have now tried it, and the result is nonsensical, so some code needs to be added to avoid duplicates.

  • I have not tried what happens if a bib entry is missing from the printed bibliography.
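
For the duplicate problem mentioned in the second point, one possible direction is to add a key to the list only if it is not already there. The following is an untested sketch: \mynewsortbibadd is a hypothetical helper name (it is not part of biblatex or etoolbox), and the block is meant to replace the \makeatletter ... \makeatother block in the example above. It only keeps the first occurrence of each key; whether the resulting order is sensible for entries printed in several bibliographies remains open.

```latex
\makeatletter
% Hypothetical helper: append the detokenized key #1 to \mynewsortbiblist
% unless the list already contains it. \ifundef guards the first call,
% when the list macro does not exist yet.
\newcommand*{\mynewsortbibadd}[1]{%
  \ifundef\mynewsortbiblist
    {\listxadd\mynewsortbiblist{\detokenize{#1}}}%
    {\xifinlist{\detokenize{#1}}{\mynewsortbiblist}%
       {}% key already recorded: do nothing
       {\listxadd\mynewsortbiblist{\detokenize{#1}}}}}
% Write a call to the helper (instead of a bare \listxadd) to the .aux file.
\AtEveryBibitem{\immediate\write\@mainaux{\string\mynewsortbibadd{\thefield{entrykey}}}}
% Unchanged from the example above.
\AtBeginDocument{\ifundef\mynewsortbiblist{}{\cslet{blx@slist@centry@\the\c@refsection @\blx@refcontext@context}\mynewsortbiblist}}
\makeatother
```

Because the .aux file now contains calls to \mynewsortbibadd, the aux files should again be deleted before switching between the two variants.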