[TeX/LaTeX] Automatic line breaking/hyphenation for overly long meaningless words

hyphenation, line-breaking

Since there are no hyphenation patterns for meaningless words, they have to be hyphenated or broken individually with methods such as \hyphenation{longword}, \seqsplit{longword}, etc.
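For reference, here is a minimal sketch of those two manual methods. It assumes the seqsplit package is installed and borrows a made-up word as the example; the break points given to \hyphenation are arbitrary choices for illustration.

```latex
\documentclass{article}
\usepackage{seqsplit}% provides \seqsplit

% Hyphenation exception: hyphens mark the only permitted break points.
% The word and its break points are made up for this sketch.
\hyphenation{vsraa-jlkuh-zjsp}

\begin{document}
% Method 1: the word may now be hyphenated at the listed positions.
This is a test for a word which is too long at the end of a line
vsraajlkuhzjsp

% Method 2: \seqsplit allows a break between any two characters,
% without inserting a hyphen character.
This is a test for a word which is too long at the end of a line
\seqsplit{vsraajlkuhzjsp}
\end{document}
```

Both methods require touching each word (or each occurrence) by hand, which is exactly the effort the question is trying to avoid.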

However, there should be a way to do this automatically, because whether a word needs hyphenation/breaking depends on where it lands, and it is impractical to add hyphenation by hand for every word that might end up at the end of a line.

\documentclass{article}
\begin{document}
This is a test for a word which is too long at the end of a line aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\\
This is a test for a word which is too long at the end of a line http://test.com/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3\\
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa This is a test for a word which is too long at the end of a line\\
http://test.com/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3 This is a test for a word which is too long at the end of a line\\
\end{document}

As you can see, hyphenation/breaking can be unnecessary or absolutely necessary, depending on the position of the word. In fact, there should be a way to forbid text from running into the margin. This is what text editors like MS Word do. I know TeX is far superior to MS Word, but Word's behaviour can give a clue. I think sending the long word to the next line is better than letting it run into the margin and beyond.

The problem appears when there are many words of this kind, and it is quite difficult to treat them individually.

Best Answer

You have to die one death or the other, I fear. Either you

  • mark up your random words beforehand, then there are nice possibilities to get them hyphenated, or handled specially
  • or you watch out for "Overfull \hbox ... in paragraph" warnings and add \hyphenation{...} exceptions afterwards whenever necessary
  • or you go the MS-Word route.

(It is a bit too much to ask that TeX, without a little help in the form of markup, identifies "vsraajlkuhzjsp" as a random word but "Antidisestablishmentarianism" as a real one.)

So in my opinion the first approach is the correct one, e.g., writing something like \rw{foobar} in the source text. Then you could have TeX manipulate these random words, including hyphenating them, by using a special \language with a set of random hyphenation patterns (that could easily be made up), or by using some macros to insert explicit hyphens at random places in the string, or ...
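A minimal sketch of the macro variant, under some assumptions: \rw is the markup name suggested above, the helper names \rw@first, \rw@loop and \rw@stop are hypothetical, and the argument is assumed to consist of plain ASCII characters without spaces or braces. Instead of random positions, this version simply allows a hyphenated break between every pair of adjacent characters, which is the easiest thing to implement.

```latex
\documentclass{article}

\makeatletter
% Unique end marker; it is only compared with \ifx, never expanded.
\def\rw@stop{\rw@stop}

% \rw{<word>}: typeset <word> with a discretionary hyphen (\-)
% between every pair of adjacent characters.
\newcommand*\rw[1]{\rw@first#1\rw@stop}

% The first character is typeset without a preceding break point.
\def\rw@first#1{%
  \ifx#1\rw@stop\else
    #1%
    \expandafter\rw@loop
  \fi}

% Every further character is preceded by a possible break.
\def\rw@loop#1{%
  \ifx#1\rw@stop\else
    \-#1%
    \expandafter\rw@loop
  \fi}
\makeatother

\begin{document}
This is a test for a word which is too long at the end of a line
\rw{aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}

\rw{aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
This is a test for a word which is too long at the end of a line
\end{document}
```

TeX then picks a break point only where the paragraph needs one, so the same markup works regardless of where the word lands. The discretionaries suppress kerning and ligatures between the letters, which should not matter much for random strings. For URLs, a hyphen at the break would be misleading, so a dedicated tool is probably the better choice there.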

But if markup is not an option, then you could of course go for option 3:

  • you could use \raggedright, which is what Word often does to hide this kind of problem
  • or you could use something like \setlength\emergencystretch{.5\textwidth} to get justified text in (nearly) all circumstances, though at the price of really ugly-looking paragraphs.
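As a rough sketch, the two settings could be tried side by side on the text from the question (with the \\ removed). The grouping braces and the explicit \par are choices made for this sketch so that each setting stays local to one paragraph; in a real document you would more likely put one of the settings in the preamble.

```latex
\documentclass{article}
\begin{document}

% Variant A: ragged-right text. A word that does not fit is moved
% to the next line and the current line is simply left short.
{\raggedright
This is a test for a word which is too long at the end of a line
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
This is a test for a word which is too long at the end of a line
http://test.com/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3\par}

% Variant B: justified text with generous emergency stretch.
% TeX may widen the interword spaces considerably before it
% lets a word stick out into the margin.
{\setlength\emergencystretch{.5\textwidth}
This is a test for a word which is too long at the end of a line
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
This is a test for a word which is too long at the end of a line
http://test.com/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3/3\par}

\end{document}
```

The \par inside each group matters: TeX uses the paragraph parameters that are in force when the paragraph ends, so closing the group first would silently discard the setting.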

The result would look like this on your text (with the \\ removed):

[Image: typeset output of the example text]
