[TeX/LaTeX] Hyphenation problem with --- versus \textemdash

hyphenation punctuation

While investigating problems with hyphenation and em-dashes, I came across this:

a piece---perhaps an installation---, involving structural feedback

The document class is article. The problem was that "installation" was not hyphenated, so an overfull line was produced: "installation---" stuck out past the right margin, and the next line began with the comma.

Now I happened to try this, and it fixed the problem:

a piece---perhaps an installation\textemdash, involving structural feedback

which breaks the line properly, for example:

...instal-
lation---, involving...

This seems counter-intuitive to me, and I would much rather see the triple hyphens in my source code, since they are far more readable. Is there a way to fix this problem while still writing ---?


This clearly looks like a LaTeX error to me, because standard em-dash usage says: "According to most American sources […] and some British sources […], an em dash should always be set closed, meaning it should not be surrounded by spaces."

Best Answer

You have, at least, three possible options:

  1. Use \textemdash instead of ---. You already mentioned this in your question, and it seems counter-intuitive to you.
  2. Manually declare the valid hyphenation points for the word preceding the ---.
  3. Load the babel package and write \allowhyphens---. I think this approach comes closest to what you want: you can make some character active (" for example) and define "--- to mean \allowhyphens---, so that writing "--- gives an em-dash that still allows the preceding word to be hyphenated. (Javier Bezos used this approach to implement a similar shorthand in the spanish module for babel.)

Here's some code showing the problem and the three alternatives I mentioned:

\documentclass{article}
\usepackage[english]{babel}

\begin{document}

some filler text to illustrate the problem here's a piece---perhaps an installation---, involving structural feedback

some filler text to illustrate the problem here's a piece---perhaps an installation\textemdash, involving structural feedback

some filler text to illustrate the problem here's a piece---perhaps an installation\allowhyphens---, involving structural feedback

some filler text to illustrate the problem here's a piece---perhaps an in\-stal\-la\-tion---, involving structural feedback

\end{document}


Here's some code that could be used to implement my suggestion in the third option above:

\documentclass{article}
\usepackage[english]{babel}

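% Make " an active character so it can be defined like a macro
\catcode`"=13
% " must now be followed by -; the pair "- expands to \allowhyphens-,
% and the remaining -- in the source completes the --- ligature.
% Any other use of " in the document will raise an error.
\def"-{\allowhyphens-}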

\begin{document}

some filler text to illustrate the problem here's a piece---perhaps an installation---, involving structural feedback

some filler text to illustrate the problem here's a piece---perhaps an installation"---, involving structural feedback

\end{document}


Yet another alternative, taken from Herbert's answer to "babel: Adding ngerman's language shorthands to english as the main document language" (thanks to egreg for pointing out that question), is to load the spanish module as a secondary language and use its existing shorthand:

\documentclass{article}
\usepackage[spanish,english]{babel}
\useshorthands{"}
\addto\extrasenglish{\languageshorthands{spanish}}

\begin{document}

some filler text to illustrate the problem here's a piece---perhaps an installation---, involving structural feedback

some filler text to illustrate the problem here's a piece---perhaps an installation"---, involving structural feedback

\end{document}
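
If loading babel is not an option, a small macro can probably achieve much the same thing. The following is only a sketch: the command name \hyphdash is hypothetical (it is not part of the answer above or of any package), and the body relies on the common \nobreak\hskip0pt idiom, which, as far as I know, is the same idea that \allowhyphens is built on. It assumes the macro is used in the middle of a paragraph, directly after the word that should remain hyphenatable.

```latex
\documentclass{article}

% \hyphdash is a hypothetical name chosen for this sketch.
% \nobreak\hskip0pt puts a penalty and zero-width glue after the
% preceding word, so TeX still considers that word for hyphenation;
% \nobreak forbids a line break at the glue itself.
\newcommand*{\hyphdash}{\nobreak\hskip0pt---}

\begin{document}

some filler text to illustrate the problem here's a piece---perhaps an installation\hyphdash, involving structural feedback

\end{document}
```

This keeps the triple hyphens out of the running text only at the places where hyphenation matters; everywhere else a plain --- can still be typed as usual.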
