[TeX/LaTeX] Strange problem with BibTeX sorting

bibtex, characters, sorting

I know that umlauts have to be avoided in BibTeX, so I write {\"a}; in fact I use BibDesk, which produces the correct LaTeX code. The sorting works for lowercase characters, but I have one name starting with Ö. Although I use {{\.I}nci {\"O}zkarag{\"o}z}, the item is sorted after all authors starting with O. What can I do?

Apart from this, Ø should be sorted last, but it is sorted in the middle of the O entries.

Edit: I want to have the German ordering, that is, Ö should be sorted as Oe not as O. The Ø-sorting is Danish.

Best Answer

When BibTeX sees a sequence like {\"O} or {\O}, it reduces these special characters to O for sorting purposes. The task is therefore to trick BibTeX into treating the first example as Oe and the second as ZZZZZO. For that, we introduce two macros, \donothing and \printsecond.

{\donothing{text}} prints nothing in the final bibliography, but "text" is taken into account for sorting, so {\"O}{\donothing{e}} will be printed as "Ö" and sorted as "Oe".

{\printsecond{foo}{bar}} will print "bar", and be sorted as "foobar".

Example:

\documentclass{article}

\usepackage{filecontents}

\begin{filecontents}{SM.bib}
  @misc{a, author={{\printsecond{ZZZZZ}{\O}}stersund, A}, title={a}, year=1000}
  @misc{b, author={{\"O\donothing{e}}zkarag{\"o}z, {\.I}nci}, title={b}, year=1001}
  @misc{c, author={Oezl, A}, title={c}, year=1002}
  @misc{d, author={Oezj, A}, title={d}, year=1003}
  @misc{e, author={Ofa, A}, title={e}, year=1004}
  @misc{f, author={Zzz, A}, title={f}, year=1005}
\end{filecontents}


\begin{document}
\nocite{*}

\providecommand*{\donothing}[1]{}
\providecommand*{\printsecond}[2]{#2}


\bibliographystyle{plain} % works
% \bibliographystyle{alpha} % works ...
% \bibliographystyle{apalike} % works
\bibliography{SM}
\end{document}
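If the database is shared between several documents, it may be more convenient to let the helper macros travel with the entries. A minimal sketch, assuming the bibliography style writes the BibTeX preamble to the .bbl file (the standard styles plain, alpha and apalike do), is to put the definitions into an @preamble entry at the top of SM.bib:

```latex
% Top of SM.bib: the strings are concatenated with # and copied to the .bbl file.
% \providecommand keeps any definition the document has already made.
@preamble{ "\providecommand*{\donothing}[1]{}"
         # "\providecommand*{\printsecond}[2]{#2}" }
```

With this in place, the two \providecommand lines in the document body become optional.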

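Using the plain style, the sorting keys for these entries (in database order) are: zzzzzostersund1000, oezkaragoz1001, oezl1002, oezj1003, ofa1004, zzz1005. They sort as oezj1003 < oezkaragoz1001 < oezl1002 < ofa1004 < zzz1005 < zzzzzostersund1000, which gives the correct order:

[Image: references output with plain]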

Using the alpha style, the entries are sorted by label, which for single-author references consists of three letters of the author name and the last two digits of the year. The sorting keys are therefore: zzzzzost1000, oezk1001, oez1002, oez1003, ofa1004, zzz1005. Note that BibTeX counts each special character as a single character when building the label, even though several plain characters may be extracted from it for the sorting key. The output is therefore:

[Image: references output with alpha]

Indeed, "Oez03" is sorted before "Özk01", because oez1003 < oezk1001 (the digit 1 sorts before k). If you would like it the other way round, the label Özk01 has to be shortened to Öz01, which can be achieved in the database by changing the second entry to

@misc{b, author={{\"O}{\donothing{e}}zkarag{\"o}z, {\.I}nci}, title={b}, year=1001}

(Note that the "invisible" e is now a special character of its own and is counted as such, since it has moved out of the first special character.)
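The two spellings of the author field can be compared side by side. This is only an illustration of the counting described above; the label prefixes in the comments are those reported for the alpha style in this answer:

```latex
% One special character: {\"O\donothing{e}} + z + k  -> label prefix "Özk"
author = {{\"O\donothing{e}}zkarag{\"o}z, {\.I}nci}

% Two special characters: {\"O} + {\donothing{e}} + z  -> label prefix "Öz"
author = {{\"O}{\donothing{e}}zkarag{\"o}z, {\.I}nci}
```

Both variants are sorted as "Oe..."; they differ only in how many name characters end up in the label.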

[Image: references output with alpha, and shorter label for Ö]