[TeX/LaTeX] Using both Unicode and Russian in TeX source

cyrillic unicode

EDIT: In the end I switched to XeTeX (with the auto-refreshing Evince viewer), as suggested by @Andrey Vihrov. I am, however, accepting the most upvoted answer.

I am lost. I looked all over tex.stackexchange and can't find a good solution, only suggestions to use babel or xetex…

I want to be able to use BOTH Russian (Cyrillic) and other Unicode characters in my LaTeX source files. For example, this does not compile:

In Dahl’s dictionary there is a similar sounding word “дуван”...
“C’est auprès de son père, écrivain de la nation pisane à la douane de Bougie, à
la fin du douzième siècle[todo], que le célèbre mathématicien Léonard Bonacci...

If I use babel and set it to Russian for the above, the compiler pukes on the other non-Russian Unicode chars. If I set babel to English, then the Russian does not work:

Package inputenc Error: Unicode char \u8:д not set up for use with LaTeX.

Please note, I don't really care about hyphenation and such; I can do that manually if need be. I just want my source documents to compile with LaTeX.

The problem is that my main document is typeset in English, with a lot of quotes in languages ranging all across Europe.

Is this possible with only LaTeX (DVI)? Or must I resort to something else? I would very much prefer to stay in LaTeX, as all my compile tools are set up for it.

Either way, I would appreciate some advice.

Best Answer

You need to tell LaTeX which languages you intend to use (T2A is the font encoding for Cyrillic):

\documentclass{article}
\usepackage[T2A,T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian,french,english]{babel}

\begin{document}

In Dahl’s dictionary there is a similar sounding word “\foreignlanguage{russian}{дуван}”.

\begin{otherlanguage*}{french}
“C’est auprès de son père, écrivain de la nation pisane à la douane de Bougie, à
la fin du douzième siècle[todo], que le célèbre mathématicien Léonard
Bonacci
\end{otherlanguage*}

\end{document}


The main problem is that the fonts used by traditional (8-bit) LaTeX have only 256 slots available for glyphs, and writing in French and Russian requires more than 256 glyphs. (Maybe this is not strictly true, but even if the number of glyphs were less than 256, a special output encoding for French and Russian would be needed; what about German and Russian, Polish and Russian, or a mix of three languages?)

You can always define an abbreviation, say \RUS, for typesetting isolated words in Russian

\newcommand{\RUS}[1]{\foreignlanguage{russian}{#1}}

(or, more efficiently, \newcommand{\RUS}{\foreignlanguage{russian}}). You have the benefit that hyphenation will be correct.
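
As a minimal sketch of how the abbreviation might be used, the following document combines the preamble from above with the \RUS definition. It assumes compilation with pdflatex (or latex) and that Cyrillic fonts in the T2A encoding are installed in your TeX distribution; the french option is dropped here only to keep the example short.

```latex
% Sketch: compile with pdflatex or latex (8-bit engines).
% Assumption: T2A-encoded Cyrillic fonts are available in the TeX installation.
\documentclass{article}
\usepackage[T2A,T1]{fontenc}        % T1 is loaded last, so it is the default encoding
\usepackage[utf8]{inputenc}
\usepackage[russian,english]{babel} % english is last, so it is the main language

% Abbreviation for isolated Russian words, as defined above
\newcommand{\RUS}[1]{\foreignlanguage{russian}{#1}}

\begin{document}

In Dahl’s dictionary there is a similar sounding word “\RUS{дуван}”.

\end{document}
```

Inside \RUS, babel switches to the T2A encoding and the Russian hyphenation patterns, then restores the main language afterwards.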

A different approach is to use an OpenType font that contains all the needed glyphs; this requires XeLaTeX or LuaLaTeX with fontspec.
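
A rough sketch of the fontspec route might look like the following. The font name CMU Serif is only an assumption: any installed OpenType or TrueType font that covers both Latin and Cyrillic should work in its place. No language packages are loaded here, so hyphenation follows the default patterns throughout.

```latex
% Sketch: compile with xelatex or lualatex, NOT pdflatex.
\documentclass{article}
\usepackage{fontspec}

% Assumption: the CMU Serif font (Computer Modern Unicode) is installed.
% Substitute any font that contains both Latin and Cyrillic glyphs.
\setmainfont{CMU Serif}

\begin{document}

% No \foreignlanguage is needed just to get the glyphs:
% the source is read as UTF-8 and the font supplies every character.
In Dahl’s dictionary there is a similar sounding word “дуван”.

C’est auprès de son père, écrivain de la nation pisane à la douane de Bougie.

\end{document}
```

With this approach neither inputenc nor fontenc is loaded, because XeLaTeX and LuaLaTeX read UTF-8 natively and address glyphs by Unicode code point rather than by 256-slot encodings.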
