[TeX/LaTeX] xeCJK messes with punctuation

cjk polyglossia punctuation xecjk xetex

This is neither a question nor really a complaint, but I find I cannot use xeCJK in multilingual documents (in the present case French + English + Chinese), simply because it messes with the punctuation in the non-Chinese parts. It will, for instance, change "this… that" to "this…that" and "this, that" to "this , that", which is inconvenient. Too bad, because the package is otherwise very nice.

Here's a minimal example showing the problem:

\documentclass{memoir}
\usepackage{polyglossia}
\usepackage{xeCJK}
\setmainlanguage{french}
\usepackage{fontspec}
\setmainfont{TeX Gyre Pagella}
\setCJKmainfont{STSong}
\begin{document}
This… that This, that
\end{document}

The output is the following:

[Image: output of the code above]

The "this , that" problem appears strictly when I set the main language to French, not if I set it to English. The "this…that" problem appears irrespective of whether the main language is French or English.

And, by the way, forgive me if this is the wrong place to draw attention to this, but the package author's email address doesn't appear in the documentation and I just chanced on this site.

Best Answer

This is not a bug.

xeCJK treats ambiguous punctuation marks as CJK punctuation. The ellipsis is one of them, so it is set in the CJK font (here STSong, 华文宋体) and the space following it is ignored.

You can use \ldots, \dots or \textellipsis to get the proper ellipsis for western languages. (In fact xeCJK patches these macros specially using \makexeCJKinactive.) Any of these should work well with xeCJK:

This\dots that This, that

This\ldots that This, that

This\textellipsis that This, that

Alan's solution is also advisable if you use only a small amount of CJK material. It works, but the drawback is that you must specify a Chinese environment manually everywhere you use Chinese. It also disables some functions of polyglossia.

If you use only a few words of Chinese without long sentences, you don't need the xeCJK package. Its special processing of CJK punctuation and spacing is overkill. xeCJK is mainly designed for CJK documents with small amounts of other languages.

Other punctuation marks affected in the same way include the quotes “ ” and ‘ ’; you are supposed to use the traditional TeX forms `foo' and ``foo'' instead.
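The advice about quotes can be illustrated with a minimal sketch. It assumes a document that uses xeCJK alone (no polyglossia), the STSong font from the examples in this answer, and the default TeX ligatures that fontspec applies to the Latin font; none of this has been tested beyond that setup.

```latex
% Sketch only: compile with xelatex; STSong must be installed.
\documentclass{article}
\usepackage{xeCJK}
\setCJKmainfont{STSong}
\begin{document}
% TeX-style input: the curly quotes come from ligatures in the Latin font,
% so xeCJK never sees a literal quote character here.
He said ``hello'' and then `goodbye'.

% Literal Unicode quotes are treated as CJK punctuation by xeCJK,
% which is what you want inside Chinese text.
他说“你好”。
\end{document}
```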


Warning: xeCJK is NOT supposed to be used together with polyglossia. xeCJK conflicts with polyglossia since they both use \XeTeXinterchartoks heavily, and I can't find a good way to let them work together.

xeCJK has a \makexeCJKinactive, but it does not help here. \makexeCJKinactive simply sets \XeTeXinterchartokenstate=0, and then many of the functions of polyglossia are disabled as well. You should choose either xeCJK or polyglossia.
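For a document that uses xeCJK alone, the effect of \makexeCJKinactive can be sketched as follows. The sketch assumes that the command performs an ordinary local assignment, so that a TeX group confines it; that assumption is mine and is worth checking against the xeCJK version you have installed.

```latex
% Sketch only: xeCJK without polyglossia; compile with xelatex.
\documentclass{article}
\usepackage{xeCJK}
\setCJKmainfont{STSong} % font taken from the examples in this answer
\begin{document}
% xeCJK active: the ellipsis is handled as CJK punctuation
汉字…汉字

% xeCJK switched off inside the group only (assumes a local assignment)
{\makexeCJKinactive This… that This, that}

% xeCJK is active again after the group
汉字
\end{document}
```

Because the switch turns off all inter-character token processing, the same trick would also suspend polyglossia's French punctuation handling inside the group, which is why it does not solve the original problem.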

If you decide not to use xeCJK, you can set:

\XeTeXlinebreaklocale "zh"
\XeTeXlinebreakskip = 0pt plus 1pt minus 0.1pt

and set the Chinese font manually every time you use Chinese. This is well suited to occasionally typesetting a few Chinese words, and it is very safe. For example:

\documentclass{memoir}
\usepackage{polyglossia}
\XeTeXlinebreaklocale "zh"
\XeTeXlinebreakskip = 0pt plus 1pt minus 0.1pt
\newfontfamily\stsong{STSong}
\newcommand\chinese[1]{{\stsong #1}}
\setmainlanguage{french} 
\setmainfont{TeX Gyre Pagella} 

\begin{document}
This… that This,
that «Chinese» % automatic spacing after « and before » obtained by polyglossia
\chinese{汉字}. % Chinese as normal text
\end{document}

However, if you have long Chinese texts, you should use xeCJK for easier font switching and better punctuation. If you also use polyglossia for French, the punctuation processing done by \french@punctuation will break unless you are careful. If you insist on using xeCJK and polyglossia together, load xeCJK before polyglossia. This might produce fewer errors, but I have not tested it enough; use it at your own risk.

\documentclass{memoir}
\usepackage{xeCJK}
\setCJKmainfont{STSong}
\usepackage{polyglossia} % It works fine only for simple text, but still dangerous
\setmainlanguage{french} 
\setmainfont{TeX Gyre Pagella} 

\begin{document}
This\ldots that This,
that «Chinese» % automatic spacing after « and before » obtained by polyglossia
文字. % Chinese handled by xeCJK

%%% Never use special punctuation handled by polyglossia together with CJK symbols!
%%% This shows a WRONG result, "文" is missing because of the « before it:
«文字»
\end{document}



I am currently the active maintainer of xeCJK. For any question about xeCJK, you can email me (leoliu.pku at gmail dot com) directly.

I'm sorry for the incompatibility of xeCJK and polyglossia. Maybe I'll add some commands to disable xeCJK safely in future versions.
