[TeX/LaTeX] Adding letters in hyphenation of Swedish words

hyphenation languages

Swedish writing conventions state that when a compound word would contain three identical consonants in a row, one of them is omitted. For example, glass (ice cream) and strut (cone) form the compound word glasstrut (ice cream cone), not glassstrut.

Another example is ägg and gula, which form äggula (egg yolk), not ägggula.

It might seem like a strange rule, and it sometimes produces strange results. For example, in writing it is impossible to distinguish between glass-strut (ice cream cone), glass-trut (a gull made of ice cream) and glas-strut (a cone made of glass). Most words don't have this ambiguity, though: äggula can't mean anything other than egg yolk, regardless of how it's pronounced.

However, if the word is hyphenated at the point where the consonant was omitted, the third consonant must be restored.
This means that äggula should be hyphenated as ägg-gula. Other examples are till-låta, till-lämpa and topp-position.

Is it possible to achieve this kind of hyphenation with LaTeX? Ideally, I am looking for a solution that works both for the whole document (the way \hyphenation does, e.g. for äggula) and locally in the text, for cases like glasstrut that should be hyphenated differently depending on the meaning (glass-strut, glass-trut).

Testing shows that none of the words above gets the correct hyphenation (using Polyglossia and XeLaTeX). If I add an entry manually with \hyphenation, the break occurs at the correct place, but I don't know how to add the "missing" letter.

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{swedish}
\begin{document}
\hyphenation{glass-trut}
Oh Norrmalmsregleringen, ho samt Västerbron och Tranebergsbron. glasstrut krigsåren var Larsson även en ledande nordisk 
\end{document}

(Hyphenation should occur in the word "glasstrut"; the other words are just filler.)

Best Answer

No, it's not possible with XeTeX, but something can be done with LuaTeX.

When (Xe)TeX decides on a possible hyphenation point, it basically inserts \discretionary{-}{}{} there, while in the case of "äggula" you want

ägg\discretionary{-}{g}{}ula

and this is usually solved by something like

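% Penalty plus zero-width glue: lets TeX hyphenate the rest of the word
\providecommand{\allowhyphens}{\nobreak\hskip0pt\relax}
% \allowhyphens follows the discretionary; glue and penalties are not allowed inside its arguments
\newcommand{\ggg}{gg\discretionary{-}{g}{}\allowhyphens}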

and inputting the word as

ä\ggg ula

(\allowhyphens lets TeX hyphenate the remainder of the compound word as well.)
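
As a sketch of how this could be generalized to other consonants, a single macro can take the consonant as its argument. The name \tripled is made up for this illustration and is not part of any package; the 1pt-wide boxes are only there to force the break.

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{swedish}
\providecommand{\allowhyphens}{\nobreak\hskip0pt\relax}
% \tripled{c} (hypothetical name): prints "cc" when unbroken, "cc-" plus "c" at a line break
\newcommand{\tripled}[1]{#1#1\discretionary{-}{#1}{}\allowhyphens}
\begin{document}
\parbox[t]{1pt}{ä\tripled{g}ula}\par    % expected break: ägg-gula
\parbox[t]{1pt}{ti\tripled{l}åta}\par   % expected break: till-låta
\parbox[t]{1pt}{to\tripled{p}osition}   % expected break: topp-position
\end{document}

The drawback is the same as with \ggg: every affected word has to be typed with the macro in the source.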

Cases like "glasstrut", where semantics is involved, are best solved by not allowing automatic hyphenation and resorting to manual insertion of \discretionary (maybe hidden in a macro) in cases where splitting the word becomes necessary. See exercise 14.8 in the TeXbook (page 96).
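
A minimal sketch of that idea, assuming one macro per meaning. The names \icecreamcone, \icecreamgull and \glasscone are invented here for illustration:

\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{swedish}
% Hypothetical macros, one per reading of the word
\newcommand{\icecreamcone}{glass\discretionary{-}{s}{}trut} % breaks as glass-strut
\newcommand{\icecreamgull}{glass\-trut}                     % breaks as glass-trut
\newcommand{\glasscone}{glas\-strut}                        % breaks as glas-strut
\begin{document}
\hyphenation{glasstrut} % no hyphen listed: the plain word is never broken
\icecreamcone, \icecreamgull{} and \glasscone{} all print as glasstrut when unbroken.
\end{document}

Each macro prints the same unbroken word and differs only in where, and how, the word may be split.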


With LuaTeX the situation is very different; you can specify

\hyphenation{ägg{-}{g}{}ula}

(there's no point in adding a new hyphenation point "äggu-la", I believe). Minimal example:

\documentclass{article}
\usepackage{fontspec}
\hyphenation{ägg{-}{g}{}ula}
\begin{document}

äggula

\parbox[t]{1pt}{äggula}
\end{document}

The three pairs of braces specify the "pre-break", "post-break" and "no-break" parts, just like the arguments to \discretionary.
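
The other words from the question could presumably be handled the same way. This is only a sketch for LuaLaTeX that mirrors the example above. As with any \hyphenation entry, only the listed break is permitted, so other hyphenation points in these words are lost.

\documentclass{article}
\usepackage{fontspec}
% pre-break "-", post-break restores the omitted consonant, no-break is empty
\hyphenation{till{-}{l}{}åta till{-}{l}{}ämpa topp{-}{p}{}osition}
\begin{document}
\parbox[t]{1pt}{tillåta}\par      % expected break: till-låta
\parbox[t]{1pt}{tillämpa}\par     % expected break: till-lämpa
\parbox[t]{1pt}{topposition}      % expected break: topp-position
\end{document}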
