The actual solution to OP's problem is to percent-encode the < and > signs, i.e. to replace each one with % followed by its hexadecimal character code:
the less-than sign (<) becomes %3C
the greater-than sign (>) becomes %3E
If you do this together (!) with using the doi package, the links in the bibliography will actually work. More information, including other problematic characters, can be found here
http://www.doi.org/doi_handbook/2_Numbering.html#2.2
and here
http://www.doi.org/syntax.html
Failing to do this will result in broken links no matter what.
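As a minimal sketch of the encoding step, Python's standard urllib.parse.quote can do the replacement. The function name encode_doi_for_url, the choice of characters left unescaped, and the DOI below are illustrative assumptions, not something taken from the DOI handbook.

```python
from urllib.parse import quote


def encode_doi_for_url(doi):
    """Percent-encode characters such as < and > in a DOI.

    Hypothetical helper: "/", ":", ";", "(" and ")" are left as-is
    because they are common in DOIs; that choice is an assumption.
    """
    return quote(doi, safe="/:;()")


# Made-up DOI under the 10.1000 example prefix, for illustration only.
doi = "10.1000/example<123>4.0.CO;2-X"
print(encode_doi_for_url(doi))  # 10.1000/example%3C123%3E4.0.CO;2-X
```

The encoded string is what would go into the doi field of the .bib entry.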
I followed user13348's suggestion and, using their request function, wrote a Python 3 script that takes a .bib file and writes a new .bib file with the DOIs it finds added. It does not use bibtool or any .aux files.
The requirements are bibtexparser and unidecode.
#!/usr/bin/env python3
import sys, re
from unidecode import unidecode
import bibtexparser
from bibtexparser.bwriter import BibTexWriter
import http.client as httplib
import urllib.parse

# Search for the DOI given a title; e.g. "computation in Noisy Radio Networks"
# Credit to user13348, slight modifications
# http://tex.stackexchange.com/questions/6810/automatically-adding-doi-fields-to-a-hand-made-bibliography
def searchdoi(title, author):
    params = urllib.parse.urlencode({"titlesearch":"titlesearch", "auth2" : author, "atitle2" : title, "multi_hit" : "on", "article_title_search" : "Search", "queryType" : "author-title"})
    headers = {"User-Agent": "Mozilla/5.0" , "Accept": "text/html", "Content-Type" : "application/x-www-form-urlencoded", "Host" : "www.crossref.org"}
    # conn = httplib.HTTPConnection("www.crossref.org:80") # Not working any more, HTTPS required
    conn = httplib.HTTPSConnection("www.crossref.org")
    conn.request("POST", "/guestquery/", params, headers)
    response = conn.getresponse()
    #print(response.status, response.reason)
    data = response.read()
    conn.close()
    # Decode the bytes instead of calling str() on them, then grab the first DOI link
    return re.search(r'doi\.org/([^"<>]+)', data.decode("utf-8", errors="replace"))

def normalize(string):
    """Normalize strings to ascii, without latex."""
    string = re.sub(r'[{}\\\'"^]',"", string)
    string = re.sub(r"\$.*?\$","",string) # better remove all math expressions
    return unidecode(string)

def get_authors(entry):
    """Get a list of authors' or editors' last names."""
    def get_last_name(authors):
        for author in authors:
            author = author.strip()
            if "," in author:
                yield author.split(",")[0]
            elif " " in author:
                yield author.split(" ")[-1]
            else:
                yield author

    try:
        authors = entry["author"]
    except KeyError:
        authors = entry["editor"]

    # Split on the word "and" only, so that names such as "Anderson" stay intact
    authors = re.split(r"\s+and\s+", normalize(authors))
    return list(get_last_name(authors))

print("Reading Bibliography...")
with open(sys.argv[1]) as bibtex_file:
    bibliography = bibtexparser.load(bibtex_file)

print("Looking for Dois...")
before = 0
new = 0
total = len(bibliography.entries)
for i,entry in enumerate(bibliography.entries):
    print("\r{i}/{total} entries processed, please wait...".format(i=i,total=total),flush=True,end="")
    try:
        if "doi" not in entry or not entry["doi"].strip():
            title = entry["title"]
            authors = get_authors(entry)
            for author in authors:
                doi_match = searchdoi(title,author)
                if doi_match:
                    doi = doi_match.groups()[0]
                    entry["doi"] = doi
                    new += 1
                    break # one DOI per entry; stop so it is not counted twice
        else:
            before += 1
    except Exception:
        # Skip entries without title/author and entries whose lookup fails
        pass
print("")

template="We added {new} DOIs !\nBefore: {before}/{total} entries had DOI\nNow: {after}/{total} entries have DOI"
print(template.format(new=new,before=before,after=before+new,total=total))

outfile = sys.argv[1]+"_doi.bib"
print("Writing result to ",outfile)
writer = BibTexWriter()
writer.indent = '    ' # indent entries with 4 spaces instead of one
with open(outfile, 'w') as bibfile:
    bibfile.write(writer.write(bibliography))
You can use it like this:
python3 searchdoi.py test.bib
The output will look like this:
Reading Bibliography...
Looking for Dois...
161/162 entries processed, please wait...
We added 49 DOIs !
Before: 42/162 entries had DOI
Now: 91/162 entries have DOI
Writing result to test.bib_doi.bib
You can now just check test.bib_doi.bib.
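To make that check less tedious, here is a rough sketch that lists the citation keys still lacking a DOI. It uses only the standard library and a regex heuristic instead of a real BibTeX parser; the function name entries_without_doi and the sample entries are made up for illustration, and it assumes each entry starts with @ at the beginning of a line and each doi field sits on its own line.

```python
import re


def entries_without_doi(bib_text):
    """Return the citation keys of entries with no non-empty doi field.

    Heuristic only: entries are split at lines starting with "@".
    """
    missing = []
    for chunk in re.split(r"(?m)^\s*@", bib_text)[1:]:
        header = re.match(r"(\w+)\s*\{\s*([^,\s]+)\s*,", chunk)
        if header is None:
            continue  # e.g. @comment or @string blocks without a key
        # A doi field counts only if something follows the opening { or "
        if not re.search(r'(?im)^\s*doi\s*=\s*[{"]\s*[^\s}"]', chunk):
            missing.append(header.group(2))
    return missing


# Made-up sample entries; in practice, read test.bib_doi.bib instead.
sample = """
@article{smith2010,
  title = {A Title},
  doi = {10.1000/xyz123},
}
@book{jones2001,
  title = {Another Title},
}
@article{lee2015,
  title = {Third},
  doi = {},
}
"""
print(entries_without_doi(sample))  # ['jones2001', 'lee2015']
```

For the real file, pass the result of open("test.bib_doi.bib").read() to the function.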
Best Answer
One way to proceed is to create a modified version of the file plainnat.bst, in which the functions that format and print fields such as doi and isbn are reduced to stubs that do nothing:
Find the file plainnat.bst in your TeX distribution. Make a copy of this file and call it, say, myplainnat.bst. (Don't edit an original file of the TeX distribution directly.)
Open myplainnat.bst in your favorite text editor, and search for the function called format.doi. (In my copy of the file, it starts on line 292.)
In this function, replace the line
with
In short, tell BibTeX to do nothing even if the field doi is non-empty. (You could go further and replace the function's entire body with { " " }. However, if you ever choose to undo some of these edits, it will be easier if you leave behind more than the absolute minimum of code.)
Repeat this procedure, as needed, for the functions format.url, format.issn, and format.isbn.
Save the file myplainnat.bst, either in the directory where your main .tex file is located or in a directory that's searched by BibTeX. If you choose the latter option, you'll probably need to update the filename database of your TeX distribution too.
Start using the new bibliography style via