[Tex/LaTex] Biblatex, biber and non-latin (cyrillic) UTF-8 names part II

biber, biblatex, cyrillic, unicode

I was looking for an example of a mixed Latin/Greek/Cyrillic document using biblatex and biber. I found "biblatex biber non-latin (cyrillic) UTF-8 names" and "biblatex: Abbreviating cyrillic authors' names with firstinits option", but neither is conclusive (and both are closed). So I came up with an MWE (below), which compiles with pdflatex test.tex; biber test; pdflatex test.tex; pdflatex test.tex (using TeX Live 2014) and produces this:

[Screenshot of the compiled output: test.png]

Mostly it looks as it should – my questions are:

  • Is this the correct way to use Cyrillic names/titles with biblatex/biber (that is, with \foreignlanguage)?
  • As in the post mentioned above, the Cyrillic name is not abbreviated, while the Latin one is. How can I get both of them abbreviated?

EDIT: The reason for not abbreviating can be seen in the .bbl file:

\name{author}{1}{}{%
  {{hash=3bd6100cde97ee45844a484f7fc7612b}{Goethe}{G\bibinitperiod}{Johann\bibnamedelima Wolfgang}{J\bibinitperiod\bibinitdelim W\bibinitperiod}{von}{v\bibinitperiod}{}{}}%
...
\name{author}{1}{}{%
  {{hash=c78e01570c60a06ba7419ac38e11ab99}{\foreignlanguage{russian}{Фёдор\bibnamedelimb Михайлович\bibnamedelimb Достоевский}}{\\bibinitperiod}{}{}{}{}{}{}}%

Apparently, biber sees through \foreignlanguage far enough to insert \bibnamedelimb between the words of the Cyrillic name, but it puts the whole wrapped string into a single name field and leaves the remaining fields (in particular, those holding the initials) empty or broken. If I hack the .bbl file manually and, by analogy with the Latin name, insert the following for both labelname and author:

    {{hash=c78e01570c60a06ba7419ac38e11ab99}{\foreignlanguage{russian}{Достоевский}}{\foreignlanguage{russian}{Д\bibinitperiod}}{\foreignlanguage{russian}{Фёдор\bibnamedelima Михайлович}}{\foreignlanguage{russian}{Ф\bibinitperiod\bibinitdelim М\bibinitperiod}}{}{}{}{}}%

… then abbreviating works. But clearly I wouldn't want to do this manually for each Cyrillic entry, so the question becomes: can I somehow coax biber into producing this automatically, maybe by entering some commands in the .bib file?

EDIT2: I also tried wrapping each word of the author name in \foreignlanguage in the .bib file:

author={\foreignlanguage{russian}{Фёдор} \foreignlanguage{russian}{Михайлович} \foreignlanguage{russian}{Достоевский}},

… this doesn't work at all (there is no abbreviation, and in fact, some words are skipped in the final output).

EDIT3: As @egreg pointed out in comments (single object), I tried removing the \foreignlanguage from the author name and writing it directly in UTF-8:

author={Фёдор Михайлович Достоевский},

… and now this splits name components correctly in the .bbl file:

  \name{author}{1}{}{%
    {{hash=e3c9a72252d468ac324a38026797e091}{Достоевский}{Д\bibinitperiod}{Фёдор\bibnamedelima Михайлович}{Ф\bibinitperiod\bibinitdelim М\bibinitperiod}{}{}{}{}}%

… but then, when pdflatex is run for the second time, it crashes with:

! LaTeX Error: Command \CYRF unavailable in encoding T1.

… so apparently, a font encoding switch is still needed.
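The error can probably be reproduced outside the bibliography as well. Here is a minimal sketch, assuming the same inputenc/fontenc/babel setup as in the MWE and that fonts for T2A are installed; it is only meant to illustrate why a Cyrillic letter needs an encoding switch when T1 is the default encoding:

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T2A,T1]{fontenc}        % T1 is loaded last, so it is the default encoding
\usepackage[russian,english]{babel} % english is the main language

\begin{document}
% A bare Cyrillic letter in the T1 context, i.e. a line containing only
%   Ф
% is expected to fail with: Command \CYRF unavailable in encoding T1.

% babel's Russian support selects a Cyrillic encoding (T2A here):
\foreignlanguage{russian}{Ф}

% A manual, local encoding switch is an alternative:
{\fontencoding{T2A}\selectfont Ф}
\end{document}
```

So whatever removes \foreignlanguage from the .bib file has to put an equivalent switch back around the printed entry.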


Here is the code:

\documentclass{article}
\usepackage{filecontents} % tlmgr install filecontents
\begin{filecontents*}{\jobname.bib}
@BOOK{von1841faust,
  title={Faust: Eine Tragödie. Erster Theil},
  author={Johann Wolfgang von Goethe},
  url={http://books.google.dk/books?id=zF8-AAAAYAAJ},
  year={1841},
  publisher={Hermann Passarge}
}

% see: https://tex.stackexchange.com/questions/215447/using-cyrillic-with-tex-gyre-pagella-and-pdflatex
% if fonts are properly set-up, there's no need for separate \fontencoding{T2A}\selectfont, as in:
%   title={\fontencoding{T2A}\selectfont \foreignlanguage{russian}{Двойник}},
% else - with or without \fontencoding{T2A}\selectfont:
% ! LaTeX Error: Command \CYRD unavailable in encoding T1.
% with \foreignlanguage and \fontencoding{T2A}:
% LaTeX Font Warning: Font shape `T2A/qpl/m/n' undefined
% (Font)              using `T2A/cmr/m/n' instead on input line 48.
% kpathsea: Running mktexmf larm1000
% ! I can't find file `larm1000'.
% <*> ...ljfour; mag:=1; nonstopmode; input larm1000
% mktextfm: `mf-nowin -progname=mf \mode:=ljfour; mag:=1; nonstopmode; input larm1000' failed to make larm1000.tfm.
% ! Font T2A/cmr/m/n/10=larm1000 at 10.0pt not loadable: Metric (TFM) file not found.

@book{dostoevskij2014dvoinik,
  title={\foreignlanguage{russian}{Двойник}},
  author={\foreignlanguage{russian}{Фёдор Михайлович Достоевский}},
  isbn={9785000640227},
  url={http://books.google.dk/books?id=OR7BAQAAQBAJ},
  year={2014},
  publisher={Aegitas}
}

\end{filecontents*}

\usepackage[utf8]{inputenc}
\usepackage[T2A, T1]{fontenc} % before babel!
\usepackage[russian,greek,english]{babel} % tlmgr install babel-russian

\usepackage[sc]{mathpazo}
\usepackage{paratype} % tlmgr install paratype
\usepackage{tgpagella}

% T2A for cyrillic - paratype
\usepackage{substitutefont}
\substitutefont{T2A}{\rmdefault}{PTSerif-TLF}

\usepackage{siunitx}

\usepackage[%
  style=ieee,
  isbn=true,
  url=true,
  defernumbers=true,
  sorting=nyt, % "Sort by name, year, title."
  %sorting=none, % "Do not sort at all. All entries are processed in citation order." (order of appearance)
  bibencoding=utf8,
  backend=biber
]{biblatex}
\bibliography{\jobname}

\begin{document}

From \si{\mega\ohm} to \si{\micro\watt} -- consider "Faust" \cite{von1841faust}, which is in (extended) Latin script; or "\foreignlanguage{russian}{Двойник}" \cite{dostoevskij2014dvoinik}, which is in Cyrillic script.

\printbibliography

\end{document}

Best Answer

You don't need \foreignlanguage in the .bib file. Use the biblatex option autolang=other instead; it automatically wraps each bibliography entry in an otherlanguage environment. For this to work, the entries in the .bib file also need langid fields.

Here is your MWE, corrected (I removed the font setup because it doesn't work on my machine for some reason, but I guess that's a different problem):

\documentclass{article}
\usepackage{filecontents} % tlmgr install filecontents
\begin{filecontents*}{\jobname.bib}
@BOOK{von1841faust,
  title={Faust: Eine Tragödie. Erster Theil},
  author={Johann Wolfgang von Goethe},
  url={http://books.google.dk/books?id=zF8-AAAAYAAJ},
  year={1841},
  publisher={Hermann Passarge},
  langid={german},
}

@book{dostoevskij2014dvoinik,
  title={Двойник},
  author={Фёдор Михайлович Достоевский},
  isbn={9785000640227},
  url={http://books.google.dk/books?id=OR7BAQAAQBAJ},
  year={2014},
  publisher={Aegitas},
  langid={russian},
}

\end{filecontents*}

\usepackage[utf8]{inputenc}
\usepackage[T2A,T1]{fontenc} % before babel!
\usepackage[russian,greek,german,english]{babel} % tlmgr install babel-russian

\usepackage{siunitx}

\usepackage[%
  style=ieee,
  isbn=true,
  url=true,
  defernumbers=true,
  sorting=nyt, % "Sort by name, year, title."
  %sorting=none, % "Do not sort at all. All entries are processed in citation order." (order of appearance)
  bibencoding=utf8,
  backend=biber,
  language=auto,    % get main language from babel
  autolang=other,
]{biblatex}
\bibliography{\jobname}

\begin{document}

From \si{\mega\ohm} to \si{\micro\watt} -- consider "Faust" \cite{von1841faust}, which is in (extended) Latin script; or "\foreignlanguage{russian}{Двойник}" \cite{dostoevskij2014dvoinik}, which is in Cyrillic script.

\printbibliography

\end{document}

Output:

[Screenshot of the compiled output: test.png]
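The ieee style used above already prints initials. For a style that prints full given names by default, the initials would have to be requested explicitly. The following is only a sketch under some assumptions: the option is called giveninits in biblatex 3.3 and later (older releases, such as the one in TeX Live 2014, call it firstinits), and refs.bib is a hypothetical file holding the two entries above together with their langid fields:

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T2A,T1]{fontenc}
\usepackage[russian,german,english]{babel}

\usepackage[
  style=numeric,     % a style that prints full given names by default
  backend=biber,
  giveninits=true,   % assumption: biblatex >= 3.3; use firstinits=true on older versions
  autolang=other,    % switch language (and with it the font encoding) per entry via langid
]{biblatex}
\addbibresource{refs.bib} % hypothetical file name, see above

\begin{document}
\nocite{*}          % list every entry without citing it in the text
\printbibliography
\end{document}
```

Because the names are plain UTF-8 in the .bib file, biber can split them into family and given parts and generate the initials, while autolang=other takes care of the encoding switch when the entry is printed.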
