Why do I need to do the extra work starting with \ifdefined in order to get my French guillemets correct in the PDF output, when using xelatex with a source specifying the use of T1-encoded fonts?
\documentclass[french]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\ifdefined\XeTeXinterchartoks
\catcode`« \active
\catcode`» \active
\def«{\char19 }
\def»{\char20 }% ça marche, même avec Babel+frenchb
\fi
\usepackage{newtxtext}
\usepackage{babel}
\frenchbsetup{og=«, fg=»}
\begin{document}
\showboxbreadth\maxdimen
\showboxdepth\maxdimen
\showoutput
«coucou»
\end{document}
The log contains:
Package: inputenc 2015/03/17 v1.2c Input encoding file
\inpenc@prehook=\toks14
\inpenc@posthook=\toks15
Package inputenc Warning: inputenc package ignored with utf8 based engines.
But inputenc is loaded after fontenc, and it is not forbidden to use fontenc with xelatex. Thus inputenc should know that T1-encoded font slots are to be used. Why then doesn't it do the job of making these characters active and mapping them to the suitable \char xx slots?
There is something escaping me here…
Notice that the code sample also uses babel+frenchb, which adds automatic spacing. It seems not to have been perturbed by my making the characters active.
To explain the issue further, consider the following input:
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\begin{document}
\showboxbreadth\maxdimen
\showboxdepth\maxdimen
\showoutput
«coucou»
\end{document}
It produces, if compiled with xelatex:
The explanation is simple: the characters « and » have code points 171 and 187 respectively. Hence the glyphs sitting in slots 171 and 187 of the T1 encoding are used, giving the result. inputenc does nothing, but it could have done something akin to my code above.
...\hbox(6.63332+0.0)x345.0, glue set 290.00977fil
....\hbox(0.0+0.0)x15.0
....\T1/cmr/m/n/10 «
....\T1/cmr/m/n/10 c
....\T1/cmr/m/n/10 o
....\T1/cmr/m/n/10 u
....\T1/cmr/m/n/10 c
....\T1/cmr/m/n/10 o
....\T1/cmr/m/n/10 u
....\T1/cmr/m/n/10 »
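A quick way to see the slot mismatch is to typeset the slots directly with \char. This is only a minimal sketch for illustration; it assumes nothing beyond the T1 encoding loaded through fontenc.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% Slots 19 and 20 hold the guillemets in the T1 encoding.
\char19 coucou\char20

% Slots 171 and 187 are what the raw code points of « and »
% select under xelatex; in T1 they hold other letters.
\char171 coucou\char187
\end{document}
```

The first line should show proper guillemets; the second shows what the unmodified second document ends up with.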
Best Answer
inputenc is ignored because it does absolutely nothing useful with XeTeX or LuaLaTeX. Better said, it would do harm! See fontenc vs inputenc
Essentially, the task performed by inputenc is translating input characters into their LICR (LaTeX internal character representation) form. With an 8-bit engine, « is two bytes long, and inputenc is able to translate it into \guillemotleft, and » into \guillemotright. But to do so it must make some characters active, which is exactly what you do later on, and which inputenc is not instructed to do here, because it was designed for an 8-bit engine.
I added a friendlier interface with newunicodechar.
If your aim is to provide translations for the characters in t1enc.dfu, then you can use it in a different way.
A proof of concept for a package xeinputenc
Now your test document can be
No explicit loading of fontenc is needed in this case, because this is already taken care of by newtxtext, but calls to it will be honored.
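For reference, the newunicodechar interface mentioned above could be used roughly as follows. This is a minimal sketch, not the original code of this answer: it assumes the newunicodechar package is installed, relies on the T1 definitions of \guillemotleft and \guillemotright, and leaves babel out to keep the example small.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{newunicodechar}
% Map each input character to its LICR command, so that the
% current font encoding (here T1) selects the right slot.
\newunicodechar{«}{\guillemotleft}
\newunicodechar{»}{\guillemotright}
\begin{document}
«coucou»
\end{document}
```

Going through the LICR commands rather than hard-coded \char numbers means the mapping keeps working if the font encoding changes.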