Punctuation kerning with luatexja and Simplified Chinese Script

Tags: chinese, kerning, luatexja

I'm dissatisfied with how babel renders CJK scripts: it's hard to get protrusion and punctuation kerning looking nice, and lines can break before punctuation. Some of these issues can be worked around easily enough, but not all.

I'm experimenting with luatexja, which gives nicer rendering results, at least for Japanese.

My end goal is to have nice Chinese (simplified and traditional), Japanese (and, I guess, Korean), and the other languages that babel supports in the same document. I also want to be able to switch between languages easily, ideally by hooking into babel's \selectlanguage to make the required CJK changes. I'm beginning to think this is too hard.

(Note: I'm not trying to replace babel; I just want the better output that luatexja gives.)

But for Chinese scripts there are a few oddities I don't understand.

In the MWE below, the kerning between ？ and ” seems too large compared with the kerning between 。 and ”.

Is it meant to be like this? (I'm not a Chinese reader/speaker).

If not, can I tweak luatexja in some way?

I notice that similar problems occur if I try to load Noto Serif CJK TC. The kerning between 。 and ” is completely wrong in this case.

(I would like to keep using LuaLaTeX rather than XeLaTeX.)

MWE

%! TeX Program = lualatex

\documentclass{article}
\pagestyle{empty}
\usepackage{luatexja-fontspec}
\setmainjfont{Noto Serif CJK SC}
\begin{document}
“你们一路上争论些什么？”

“谁想为首，就该在众人中做最小的，做众人的仆人。”
\end{document}


Best Answer

The question mark is the fullwidth form, so what babel and luatexja output is what I would expect. You can apply a transform to replace it by the halfwidth form:

\documentclass{article}
\pagestyle{empty}
\usepackage[chinese, provide=*]{babel}
\babelfont{rm}{Noto Serif CJK SC}
% Replace the fullwidth question mark (U+FF1F) with the ASCII one
\babelprehyphenation{chinese}{ ？ }{ string=? }
\begin{document}
“你们一路上争论些什么？”
\end{document}


A more traditional approach is making the fullwidth character active and defining it as the halfwidth form.
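
As a rough illustration of that active-character approach, here is a minimal sketch. It is not taken from the answer: it assumes LuaLaTeX with UTF-8 input and the same babel setup as above, and it uses only the standard \catcode and \protected\def mechanisms. How it interacts with luatexja has not been checked here.

```latex
\documentclass{article}
\pagestyle{empty}
\usepackage[chinese, provide=*]{babel}
\babelfont{rm}{Noto Serif CJK SC}

% Make the fullwidth question mark (U+FF1F) an active character ...
\catcode`\？=\active
% ... and define it to expand to the ASCII (halfwidth) question mark.
% \protected keeps it from expanding when written to auxiliary files.
\protected\def？{?}

\begin{document}
“你们一路上争论些什么？”
\end{document}
```

Unlike the transform, this changes the character for the whole document rather than only for text in the chinese language, so the transform is probably the better fit when several languages are mixed.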

As for the period (or question mark) followed by a closing quote, it's in principle another task for a transform, but sadly there is a bug: negative numbers raise an error. The idea is that you can write something like:

% Warning: it doesn’t work!
\babelprehyphenation{chinese}{ [？。]” }{
  {},
  { insert, penalty = 10000 },
  { insert, space = -.5 0 0},
  {}
}

I'll fix it in the next release (next week, very likely).

Edit.

Here is an MWE with a workaround:

\documentclass{article}

\pagestyle{empty}
\usepackage[chinese, provide=*]{babel}
\babelfont{rm}{Noto Serif CJK SC}

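% Workaround: patch the Lua pattern babel uses to parse "space = ..."
% so that it also accepts negative numbers (a leading minus sign).
\usepackage{etoolbox}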
\makeatletter
\catcode`\%=12
\patchcmd{\bbl@settransform}
  {(space)%s*=%s*([%d%.]+)%s+([%d%.]+)%s+([%d%.]+)}
  {(space)%s*=%s*([%-%d%.]+)%s+([%-%d%.]+)%s+([%-%d%.]+)}
  {}{}
\catcode`\%=14
\makeatother

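% After a fullwidth question mark or an ideographic full stop followed by
% a closing quote, insert a no-break penalty and a negative space.
\babelprehyphenation{chinese}{ [？。]” }{
  {},
  { insert, penalty = 10000 },
  { insert, space = -.5 0 0},
  {}
}

\begin{document}
“你们一路上争论些什么？”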

“谁想为首，就该在众人中做最小的，做众人的仆人。”

\end{document}
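
To see why the patch helps, the two Lua patterns from the \patchcmd arguments above can be tried on the rule string directly. The following is only a sketch for experimenting: it assumes the luacode package is available, and the helper name show and the variable names are made up for illustration. It does not reproduce how babel itself parses the rule.

```latex
\documentclass{article}
\usepackage{luacode}
\begin{document}
\begin{luacode*}
-- Patterns copied from the \patchcmd arguments: before and after the patch.
local old = "(space)%s*=%s*([%d%.]+)%s+([%d%.]+)%s+([%d%.]+)"
local new = "(space)%s*=%s*([%-%d%.]+)%s+([%-%d%.]+)%s+([%-%d%.]+)"
-- The rule used in the transform, with a negative width.
local rule = "insert, space = -.5 0 0"

-- Hypothetical helper: typeset whether a pattern matches the rule.
local function show(label, pattern)
  local key, width, plus, minus = rule:match(pattern)
  if key then
    tex.sprint(label .. ": matched " .. width .. " " .. plus .. " " .. minus .. "\\par")
  else
    tex.sprint(label .. ": no match\\par")
  end
end

show("old pattern", old)
show("new pattern", new)
\end{luacode*}
\end{document}
```

The old character class [%d%.] only allows digits and dots, so the leading minus sign in -.5 should keep it from matching. The patched class [%-%d%.] also allows a minus sign, so the three numbers should be captured.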

