[Tex/LaTex] LTR sequences within RTL text – alternative to cumbersome markup

right-to-left xepersian xetex

I am writing a book in Persian, but text in Latin script is scattered all over the book.

I have to wrap every Latin-script passage in \lr{} to tell XeTeX to set the words from left to right. It's really cumbersome.

If I do not use \lr{}, the input One Two Three is output as Three Two One in Persian documents, and the input یک دو سه is output as سه دو یک in English documents.

This seems like a very basic requirement, and I wonder why XeTeX cannot do it without extra markup.
Is there any way to avoid using \lr{}?

Best Answer

To start with, this is not as basic as you claim it to be. You may be able to do it using the \XeTeXinterchartoks primitive of XeTeX.

The following example comes from a response Jonathan Kew (the author of XeTeX) gave me a while ago; I just modified it to work with XePersian:

\documentclass{article}
\usepackage{xepersian}
\makeatletter
% classes 1-3 are used in unicode-letters.tex, so we'll put the Latin letters in 4
\newcount\xp@n
\xp@n=`\A \loop \XeTeXcharclass \xp@n=4 \ifnum\xp@n<`\Z \advance\xp@n by 1 \repeat
\xp@n=`\a \loop \XeTeXcharclass \xp@n=4 \ifnum\xp@n<`\z \advance\xp@n by 1 \repeat
% when we encounter class 4, we'll do \startlatin
\XeTeXinterchartoks 0 4 {\startlatin}
\XeTeXinterchartoks 255 4 {\startlatin}
% and when we encounter class 0, we'll do \finishlatin
\XeTeXinterchartoks 255 0 {\finishlatin}
\XeTeXinterchartoks 4 0 {\finishlatin}
\newcommand{\startlatin}{\if@Latin\else\bgroup\beginL\latinfont\@Latintrue\fi}
\newcommand{\finishlatin}{\if@Latin\unskip\endL\egroup{ }\fi}
\makeatother
\XeTeXinterchartokenstate=1
\begin{document}
این یک آزمایش است
One Two Three
و ادامه آن
\end{document}
Note that it changes both the font (to the Latin font) and the direction (to LTR).

However, I suspect you're not really going to be able to do this on a large scale, because it will be too difficult to handle things like
punctuation and spacing at direction changes. In unidirectional text, it may not matter whether the "language switch" happens before or
after the space (or punctuation mark), but with bidi it does matter. I think in the end you're still going to need markup if you want to
reliably mix LR and RL scripts.

In addition, LR and RL scripts share some characters. For example, how would you decide whether ) or ( is an RL character or an LR one?
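
For shared characters such as parentheses, explicit markup remains the fallback. A minimal sketch, assuming plain xepersian without the inter-character token setup above (the two should not be assumed to combine safely), and with font setup omitted as in the example above:

```latex
\documentclass{article}
\usepackage{xepersian}
% Font setup omitted, as in the example above; add \settextfont{...} if your setup needs it.
\begin{document}
% The parentheses sit inside \lr{}, so they are typeset left to right with the Latin words.
این یک آزمایش \lr{(One Two Three)} است

% Here the parentheses stay outside \lr{} and follow the surrounding right-to-left text.
این یک آزمایش (\lr{One Two Three}) است
\end{document}
```

Which of the two placements is wanted depends on the sentence, which is why a purely automatic rule has trouble deciding it.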

Alternatively, you may be able to implement a preprocessor (written in C or any other language) that converts, say, test.tex to test1.tex and places all LR words inside \lr{}. In fact, BiDiTeX already exists, so you may be able to get its sources and modify them a bit to work with the bidi/XePersian packages.

Related Solutions

Fonts – Can XeTeX or LuaTeX Use MetaFont Fonts?

I can't speak for XeTeX with certainty, but I am very sure that both LuaTeX and XeTeX can load Metafont fonts. So the answer to your question is "yes". But if there is a Type 1 or OpenType alternative, I'd always go for that.

LuaTeX loads cmr10 by default, not Latin Modern. Of course one could change that, but the idea is to get as much portability between the different engines as possible. So if you run a document through pdfTeX and through LuaTeX, both results should be the same.

The user is supposed to choose the fonts he or she wants. Computer Modern, and thus Latin Modern, has the huge advantage of being a) free, b) a big set of fonts (italic, monospaced, sans serif, ...) and c) distributed with all ancient TeX systems. If the distributions chose another font as the default, not much would be gained, as there is really not much more to do than:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{Linux Libertine O}
\begin{document}
Hello world.
\end{document}

While in principle I like your idea, I think it is extremely unlikely to ever happen, so I would not bother to think too much about it.

[Tex/LaTex] Sample files of LTR text on left and RTL text on right

Use XeLaTeX and the polyglossia package along with the multicol package. Here's an example. Since I don't know Persian, I've just used Google Translate to translate something. I'm sure the translation isn't very good.

% !TEX TS-program = XeLaTeX

\documentclass[12pt]{article}
\usepackage{multicol}
\usepackage{polyglossia}
\usepackage{fontspec}
\setmainlanguage{english}
\setotherlanguage{farsi}
\newfontfamily\farsifont[Script=Arabic]{Scheherazade}

\begin{document}
\begin{multicols}{2}
This is some text that is in English and since I know English I didn't have to use Google Translate to translate it.
\columnbreak

\begin{farsi}
برخی از متن که به زبان فارسی است، اما من فارسی صحبت نمی کنم، بنابراین من برخی از انگلیسی به فارسی با استفاده از گوگل ترجمه، ترجمه شده است. من کاملا مطمئنم که ترجمه واقعا افتضاح است.
\end{farsi}
\end{multicols}
\end{document}
