Bibtex: Remove URL if DOI is present and keyword does not exist

Tags: biber, biblatex, filter

This question is highly related to:
Biblatex: use doi only if there is no URL

However, I cannot get it to work for my example. I have the following setup. Most of my bib entries contain DOI, URL, and sometimes keywords (not always present). I want to remove (ignore) the URL if the DOI is present and the keywords do not contain the word "primary". In other words, if it's a primary source, don't remove anything.

I set up a biber.conf, since it's easier, which looks like this:

<config><sourcemap>
    <maps datatype="bibtex" map_overwrite="1">
        <map map_overwrite="1"> 
            <map_step map_field_source="doi" map_final="1"/>
            <map_step map_field_source="keywords" map_match=".*primary.*" map_final="1"/>
            <map_step map_field_set="url" map_null="1"/>
        </map>
    </maps>
</sourcemap></config>

Surprisingly, this does not work; it does exactly the opposite. The URL is now removed only if the "primary" keyword exists, which I found really counterintuitive. So I changed map_match to map_notmatch. This works, but not always: the URL is properly removed only if the keywords field exists. If there is no keywords field in the bib entry at all, the URL is not filtered (see the MWE below).

<map_step map_field_source="keywords" map_notmatch=".*primary.*" map_final="1"/>

Now, this does not make any sense to me. If the keywords contain "primary" we terminate with map_final but if there is no keywords field at all, we do not terminate? So I tried to add an additional map step to check if the keywords exist but it does not work either. Can someone please help?

MWE:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[backend=biber]{biblatex}

\addbibresource{\jobname.bib}
\usepackage{filecontents}
% kastenholz1 URL is not filtered... WHY? (maybe because no keywords field is present?)
\begin{filecontents}{\jobname.bib}
@article{kastenholz1,
  author = {Kastenholz, M. A. and H{\"u}nenberger, Philippe H.},
  title = {Computation of methodology\hyphen independent ionic solvation free
    energies from molecular simulations},
  url = {http://dx.doi.org/10.1063/1.2172593},
  doi = {10.1063/1.2172593},
}
@article{kastenholz2,
  author = {Kastenholz, M. A. and H{\"u}nenberger, Philippe H.},
  title = {Computation of methodology\hyphen independent ionic solvation free
    energies from molecular simulations},
  url = {http://dx.doi.org/10.1063/1.2172593},
  doi = {10.1063/1.2172593},
  keywords = {secondary}
}
@article{kastenholz3,
  author = {Kastenholz, M. A. and H{\"u}nenberger, Philippe H.},
  title = {Computation of methodology\hyphen independent ionic solvation free
    energies from molecular simulations},
  url = {http://dx.doi.org/10.1063/1.2172593},
}
@article{kastenholz4,
  author = {Kastenholz, M. A. and H{\"u}nenberger, Philippe H.},
  title = {Computation of methodology\hyphen independent ionic solvation free
    energies from molecular simulations},
  url = {http://dx.doi.org/10.1063/1.2172593},
  doi = {10.1063/1.2172593},
  keywords = {primary}
}
\end{filecontents}

\begin{document}
  \nocite{*}
  \printbibliography
\end{document}

biber.conf

<config>
<sourcemap>
    <maps datatype="bibtex" map_overwrite="1">
        <map map_overwrite="1"> 
            <map_step map_field_source="doi" map_final="1"/>
            <map_step map_field_source="keywords" map_notmatch=".*primary.*" map_final="1"/>
            <map_step map_field_set="url" map_null="1"/>
        </map>
    </maps>
</sourcemap>
</config>


Best Answer

The outcome of the mapping step of the first biber.conf shown in the question is not unexpected.

The final option means that the remaining steps of the map are carried out only if the condition of this step is satisfied; otherwise processing of the map stops for that entry. For map_step map_field_source="doi" map_final="1", the condition is that the doi field exists. For map_step map_field_source="keywords" map_match=".*primary.*" map_final="1", the condition is that the contents of the keywords field match the regular expression .*primary.* (i.e. the field contains the substring primary). Hence, in the example the URL is deleted only for primary sources that have a DOI.

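notmatch gets us closer to what we want, but unfortunately a notmatch step also fails if the field in question does not exist. With final set, the map then stops before the URL is removed, which is why entries without a keywords field keep their URL. This means we can't easily get things done in one mapping.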
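
To make the effect of map_final concrete, here is the second biber.conf from the question again, annotated with what happens to each of the four test entries. The comments are a reading of the behaviour described above and of the results reported in the question, not output produced by biber itself.

```xml
<config>
<sourcemap>
    <maps datatype="bibtex" map_overwrite="1">
        <map map_overwrite="1">
            <!-- Step 1: continue only if the doi field exists.
                 kastenholz3 has no doi: the map stops here, URL kept. -->
            <map_step map_field_source="doi" map_final="1"/>
            <!-- Step 2: continue only if keywords exists AND does not match.
                 kastenholz1 has no keywords: the step fails, the map stops,
                   URL kept (this is the unwanted case).
                 kastenholz4 has keywords = primary: the regex matches,
                   the map stops, URL kept.
                 kastenholz2 has keywords = secondary: no match,
                   processing continues. -->
            <map_step map_field_source="keywords" map_notmatch=".*primary.*" map_final="1"/>
            <!-- Step 3: reached by kastenholz2 only; its URL is removed. -->
            <map_step map_field_set="url" map_null="1"/>
        </map>
    </maps>
</sourcemap>
</config>
```

Only the entry without a keywords field behaves differently from what was intended, which is the case the separate mapping suggested next takes care of.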

I suggest you use two mappings:

  • the first mapping clears out the URL if there is a DOI but no keywords field at all
  • the second mapping clears out the URL if there is a DOI and the keywords field is present, but does not match primary.

In the MWE below I use \DeclareSourcemap instead of biber.conf, because that makes the example more self-contained (plus, I find the \DeclareSourcemap syntax easier to work with). If you need help with the relevant XML syntax, have a look at the .bcf file, where biblatex writes the mapping in XML form.

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage[backend=biber]{biblatex}

\DeclareSourcemap{
  \maps[datatype = bibtex]{
    \map{
      \step[notfield = keywords, final]
      \step[fieldsource = doi, final]
      \step[fieldset = url, null]
    }
    \map{
      \step[fieldsource = keywords, notmatch = \regexp{\bprimary\b}, final]
      \step[fieldsource = doi, final]
      \step[fieldset = url, null]
    }
  }
}


\begin{filecontents}{\jobname.bib}
@article{kastenholz1,
  author = {Kastenholz, M. A. and Hünenberger, Philippe H.},
  title  = {URL + DOI},
  url    = {http://dx.doi.org/10.1063/1.2172593},
  doi    = {10.1063/1.2172593},
}
@article{kastenholz2,
  author   = {Kastenholz, M. A. and Hünenberger, Philippe H.},
  title    = {URL + DOI // secondary},
  url      = {http://dx.doi.org/10.1063/1.2172593},
  doi      = {10.1063/1.2172593},
  keywords = {secondary},
}
@article{kastenholz3,
  author = {Kastenholz, M. A. and Hünenberger, Philippe H.},
  title  = {URL},
  url    = {http://dx.doi.org/10.1063/1.2172593},
}
@article{kastenholz4,
  author   = {Kastenholz, M. A. and Hünenberger, Philippe H.},
  title    = {URL + DOI // primary},
  url      = {http://dx.doi.org/10.1063/1.2172593},
  doi      = {10.1063/1.2172593},
  keywords = {primary},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
  \nocite{*}
  \printbibliography
\end{document}

[1] M. A. Kastenholz and Philippe H. Hünenberger. “URL”. In: (). url: http://dx.doi.org/10.1063/1.2172593.
[2] M. A. Kastenholz and Philippe H. Hünenberger. “URL + DOI”. In: (). doi: 10.1063/1.2172593.
[3] M. A. Kastenholz and Philippe H. Hünenberger. “URL + DOI // primary”. In: (). doi: 10.1063/1.2172593. url: http://dx.doi.org/10.1063/1.2172593.
[4] M. A. Kastenholz and Philippe H. Hünenberger. “URL + DOI // secondary”. In: (). doi: 10.1063/1.2172593.
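
If you would rather stay with biber.conf, the two mappings might translate roughly as follows. This is a sketch, not a tested configuration: the attribute names follow the pattern of the biber.conf in the question, and map_notfield in particular is my assumption for the XML counterpart of notfield. Compare with the sourcemap section that biblatex writes to the .bcf file before relying on it.

```xml
<config>
<sourcemap>
    <maps datatype="bibtex">
        <!-- Mapping 1: no keywords field at all.
             map_notfield is an assumed attribute name; check the .bcf file. -->
        <map>
            <map_step map_notfield="keywords" map_final="1"/>
            <map_step map_field_source="doi" map_final="1"/>
            <map_step map_field_set="url" map_null="1"/>
        </map>
        <!-- Mapping 2: keywords field present, but without the word primary. -->
        <map>
            <map_step map_field_source="keywords" map_notmatch="\bprimary\b" map_final="1"/>
            <map_step map_field_source="doi" map_final="1"/>
            <map_step map_field_set="url" map_null="1"/>
        </map>
    </maps>
</sourcemap>
</config>
```

As in the \DeclareSourcemap version, the regular expression uses \b word boundaries, so a keyword that merely contains the letters primary inside a longer word would not count as primary.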