[Tex/LaTex] Making and sorting index for Russian

xindy

Trying to make index with Russian words, MWE:

\documentclass{book}
\usepackage[T1, T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[english, russian]{babel}
\usepackage[xindy]{imakeidx}
\makeindex

\begin{document}

\chapter{Первая}

\index{notepad}
\index{apple}
\index{часть}
\index{дерево}
\index{электрон}

\printindex

\end{document}

I use MiKTeX (full and updated), run: latexmk.exe -pdf file.tex

This gives me file.pdf with russian words in Index, but they are sorted in wrong way. file.idx is:

\indexentry{notepad}{1}
\indexentry{apple}{1}
\indexentry{\IeC {\cyrch }\IeC {\cyra }\IeC {\cyrs }\IeC {\cyrt }\IeC {\cyrsftsn }}{1}
\indexentry{\IeC {\cyrd }\IeC {\cyre }\IeC {\cyrr }\IeC {\cyre }\IeC {\cyrv }\IeC {\cyro }}{1}
\indexentry{\IeC {\cyrerev }\IeC {\cyrl }\IeC {\cyre }\IeC {\cyrk }\IeC {\cyrt }\IeC {\cyrr }\IeC {\cyro }\IeC {\cyrn }}{1}

And I see the problem is that LaTeX makes index according to these {\cyrch}, {\cyre} etc. I thought using utf8 encoding and xindy could solve all the problems with languages. How to make russian index with correct sorting: дерево, часть, электрон in my case?

Best Answer

Due to some constraints, the \index command doesn't work very well with UTF-8 characters (something which could only be solved with a brand new version, I'm afraid.)

You can overcome the issue by doing

\documentclass{book}
\usepackage[T1, T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[english, russian]{babel}
\usepackage[xindy]{imakeidx}
\makeindex

\makeatletter
\newcommand{\rindex}[2][\imki@jobname]{%
  \index[#1]{\detokenize{#2}}%
}
\makeatother

\begin{document}

\chapter{Первая}

\rindex{notepad}
\rindex{apple}
\rindex{часть}
\rindex{дерево}
\rindex{электрон}

\printindex

\end{document}

Calling

texindy -L russian -C utf8 <filename>.idx

produces the index as expected:

Related Solutions

[Tex/LaTex] glossaries special sorting for one glossary

As from glossaries version 4.04, it's now possible to use different sort methods, but only if you use the \makenoidxglossaries option which uses TeX to perform the sorting:

\documentclass{article}

\usepackage[nopostdot,nogroupskip]{glossaries}

\newglossary{nomen}{}{}{Nomenclature}

\makenoidxglossaries

% main glossary

\newglossaryentry{goose}{name={goose},description={}}
\newglossaryentry{zebra}{name={zebra},description={}}
\newglossaryentry{duck}{name={duck},description={}}
\newglossaryentry{aardvark}{name={aardvark},description={}}

% nomenclature
\newglossaryentry{X}{type=nomen,name={X},description={}}
\newglossaryentry{hashX}{type=nomen,name={\#X},description={}}
\newglossaryentry{X-index}{type=nomen,name={X\_index},description={}}
\newglossaryentry{hashX-index}{type=nomen,name={\#X\_index},description={}}
\newglossaryentry{Xindex}{type=nomen,name={Xindex},description={}}
\newglossaryentry{hashXindex}{type=nomen,name={\#Xindex},description={}}

\begin{document}
Test document.

\glsaddall

\printnoidxglossary[sort=word]

\printnoidxglossary[type=nomen,sort=def]

\end{document}

(Two LaTeX runs are required to display the glossaries.) This produces:

Image of resulting glossaries

The drawback here is that when this method is used for alphabetical sorting it can be very slow, orders by character code, and has problems if the sort value contains commands. The glossaries-extra extension package provides a hybrid method allowing you to use makeindex/xindy for the alphabetical sorting and the "noidx" method for ordering by use/definition.

\documentclass{article}

\usepackage[nogroupskip]{glossaries-extra}

\newglossary{nomen}{}{}{Nomenclature}

\makeglossaries[main]

% main glossary

\newglossaryentry{goose}{name={goose},description={}}
\newglossaryentry{zebra}{name={zebra},description={}}
\newglossaryentry{duck}{name={duck},description={}}
\newglossaryentry{aardvark}{name={aardvark},description={}}

% nomenclature
\newglossaryentry{X}{type=nomen,name={X},description={}}
\newglossaryentry{hashX}{type=nomen,name={\#X},description={}}
\newglossaryentry{X-index}{type=nomen,name={X\_index},description={}}
\newglossaryentry{hashX-index}{type=nomen,name={\#X\_index},description={}}
\newglossaryentry{Xindex}{type=nomen,name={Xindex},description={}}
\newglossaryentry{hashXindex}{type=nomen,name={\#Xindex},description={}}

\begin{document}
Test document.

\glsaddall

\printglossary

\printnoidxglossary[type=nomen,sort=def]

\end{document}

The optional argument to \makeglossaries should be a list of glossary labels identifying those that need sorting by makeindex (or xindy if the xindy package option is used). Any remaining glossaries (nomen in this case) are treated as though \makenoidxglossaries was used. The result is the same as the previous example but the build process needs to include makeindex/xindy (either explicitly or through the helper scripts makeglossaries or makeglossaries-lite). Both makeglossaries and makeglossaries-lite can detect from the .aux file that only the main glossary needs processing for this example.

There is also a third method which is to use glossaries-extra with bib2gls. This requires some modification to the document as now the entries must be defined in one or more .bib files.

For example, the file entries.bib may contain the alphabetical entries. If the entry has a description you can use @entry:

@entry{goose,
  name={goose},
  plural={geese},
  description={long-necked waterbird}
}

If the entry doesn't have a description you can use @index:

@index{goose,
  name={goose},
  plural={geese}
}

With @index if the name field exactly matches the label, you can omit it:

@index{goose,
  plural={geese}
}

(You can't do this with @entry.) The label may only contain UTF-8 characters if you are using XeLaTeX or LuaLaTeX.

So here's entries.bib:

@index{goose,
 plural={geese}
}

@index{zebra}

@index{duck}

@index{aardvark}

The other entries can be defined with @entry. For example:

@entry{X,
  name={X},
  description={}
}

but since these entries are like symbols, you can use @symbol instead (which, like @index, doesn't require the description field):

@symbol{X,
  name={X}
}

Here's symbols.bib:

@symbol{X,
  name={X}
}

@symbol{hashX,
  name={\#X}
}

@symbol{X-index,
  name={X\_index}
}

@symbol{hashX-index,
  name={\#X\_index}
}

@symbol{Xindex,
  name={Xindex}
}

@symbol{hashXindex,
  name={\#Xindex}
}

The document now looks like:

\documentclass{article}

\usepackage[record,% use bib2gls
 nogroupskip]{glossaries-extra}

\newglossary{nomen}{}{}{Nomenclature}

\GlsXtrLoadResources[
  src={entries}, % terms defined in entries.bib
  type={main}, % put these terms in the 'main' glossary
  save-locations=false,% no location list required
  selection={all}% select all terms
]

\GlsXtrLoadResources[
  src={symbols}, % terms defined in symbols.bib
  type={nomen}, % put these terms in the 'nomen' glossary
  sort=none,% don't sort
  save-locations=false,% no location list required
  selection={all}% select all terms
]


\begin{document}
Test document.

\printunsrtglossaries

\end{document}

If the file is called myDoc.tex then the document build is:

pdflatex myDoc
bib2gls myDoc
pdflatex myDoc

(If you want letter groups you need bib2gls -g or bib2gls --group.)

The resulting document doesn't show the locations as I've used save-locations=false. If you want locations, omit that option. If you only want to select entries that have been used in the document, omit selection=all.

The default behaviour is to sort by the document locale. In this example, the locale hasn't been set so bib2gls falls back on the system locale (which for me is en-GB). You can specify a different language/locale, for example, sort=de-CH-1996 or sort=pt-BR or sort=en. Order of definition is obtained with sort=none. There are other sort methods available.

Best Answer

Related Solutions

[Tex/LaTex] glossaries special sorting for one glossary

Related Question