[TeX/LaTeX] It doesn’t hyphenate words ending with “—”

fontspec, hyphenation, spacing, spanish, xetex

UPDATE

Well, after a while I decided to look at this “problem” again, and I just discovered that in pdfLaTeX, using \usepackage[utf8]{inputenc} and the em-dash (the Unicode character —, not the ligature ---), it works perfectly (at least in what I tried). Minor edit: as @cfr mentions in her answer (I had forgotten it), it is in fact possible to use --- (the ligature) if you use the T1 encoding and \hyphenchar\font=\string"7F. With both methods (whether I use the em-dash or ---), microtype works perfectly.
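For reference, here is a minimal sketch of the pdfLaTeX setup with the --- ligature. It assumes the default T1 text font, in which slot "7F holds a second hyphen glyph; the placement of the \hyphenchar assignment right after \begin{document} is my own choice, not something taken from @cfr's answer. Note that \hyphenchar is set per font, so this only affects the font that is current at that point.

```latex
\documentclass{scrartcl}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[hmargin = 4cm]{geometry}
\usepackage{microtype}

\begin{document}
% Use slot "7F of the T1 font as the hyphen character of the current font,
% so the "-" characters that build the --- ligature are no longer
% treated as explicit hyphens (which would block hyphenation of the word).
% \string protects the " in case a language package makes it active.
\hyphenchar\font=\string"7F
---Hola, esto es un texto absurdo ---para ejemplificar lo que ocurreconestedocumento--- con algunas palabras más.
\end{document}
```

If other fonts (italic, footnote size, and so on) are used, the same assignment would presumably have to be repeated for each of them.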

Now, the problem remains in XeLaTeX. To make my goal clear: I want the em-dash to behave correctly. It doesn't matter whether I have to type — (the Unicode em-dash) or --- (the usual LaTeX way). Among other things: words next to it should be hyphenated correctly; it should always stay attached to the word; if it is followed by a comma or period, they should stay together; and microtype should work, e.g., the hyphen should still hang in the margin.

Of course, there is a basic solution (which works in any engine): tell XeLaTeX the break points by hand, for instance ocur\-recone\-stedoc\-umento---. But I'm looking for an automatic solution.

If you have anything to add, please do. Any input is welcome!


Note: Every — you see in the code is an em-dash.

I'm writing a paper, and XeLaTeX doesn't hyphenate words that end with — (the traditional LaTeX ---). After reading the comments, I realized this is also a common problem in LaTeX (not only XeLaTeX). Here is a minimal working example:

This code (full example at the bottom of the question) outputs

—Hola, esto es un texto absurdo —para ejemplificar lo que ocurreconestedocumento— con algunas palabras más.


If we replace the last — with a comma, for example, the word is hyphenated correctly:

—Hola, esto es un texto absurdo —para ejemplificar lo que ocurreconestedocumento, con algunas palabras más.


Moreover, if the phrase ends with “—.”, XeLaTeX (or whatever is responsible for this) moves the full stop to the next line:

—Hola, esto es un texto absurdo —para ejemplificar lo que ocurreconestedocumento—.


Here is the full minimal working example:

\documentclass{scrartcl}
\usepackage[hmargin = 4cm]{geometry}
\usepackage{fontspec}

\begin{document}
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

—Hola, esto es un texto absurdo —para ejemplificar lo que ocurreconestedocumento— con algunas palabras más.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
\end{document}

Any ideas?

Best Answer

You can prepare the following example:

\input ucode
\input lmfonts

\hsize=12cm

—Hola, esto es un texto absurdo —para ejemplificar lo que
ocurreconestedocumento— con algunas palabras más.

\end

and you can try to process it with 1) xetex test and 2) xetex -fmt pdfcsplain test. You will see different results: with 1) the long word isn't hyphenated; with 2) it is.

You can study the differences in the code settings, hyphenation settings, etc. IMHO XeLaTeX copies its settings from xetex's eplain, not from pdfcsplain. So you have the problem.

Edit: Where is the difference? We can see the following setting in xetex.ini and in xelatex.ini:

 \XeTeXdashbreakstate=1

Bingo! Here is the problem. Set \XeTeXdashbreakstate=0 and your words will be hyphenated.
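
Applied to the example from the question, this could look as follows. This is only a sketch: placing the assignment in the preamble is an assumption on my part (it is an ordinary local integer assignment, so it can also be set later or inside a group). According to the XeTeX reference, \XeTeXdashbreakstate controls whether line breaks are allowed after en- and em-dashes, so with the value 0 a line should no longer break directly after the dash either.

```latex
\documentclass{scrartcl}
\usepackage[hmargin = 4cm]{geometry}
\usepackage{fontspec}

% Override the format default (\XeTeXdashbreakstate=1 in xelatex.ini).
% With 0, XeTeX does not treat en- and em-dashes as break points,
% and the word before the dash can be hyphenated again.
\XeTeXdashbreakstate=0

\begin{document}
—Hola, esto es un texto absurdo —para ejemplificar lo que ocurreconestedocumento— con algunas palabras más.
\end{document}
```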
