[TeX/LaTeX] Option to break URLs with carriage-return symbol

bibliographies hyperref hyphenation urls

There are several ways to get long, uncooperative URLs to line-break in LaTeX. If you're using pdfLaTeX, the hyperref package can break them across lines for you, roughly after any /. With the [hyphens] option, it also allows line breaks after any - character. If you're not using pdfLaTeX, the breakurl package gives the same behavior, but its documentation says that it will never break URLs at dashes, because the dashes could be mistaken for hyphens that aren't part of the URL. Similarly, the documentation for the hyphens option of hyperref warns that breaking after hyphens may lead to typographic confusion. Specifically, if LaTeX had to process a URL like http://foo.com/is-this-read-able-or-readable and happened to break after the third dash, then the printed result:

   http://foo.com/is-this-read-
able-or-readable

is ambiguous: unless you as a reader already know that breakurl or [hyphens]{hyperref} never introduces new - characters, you could read that URL as is-this-readable-or-readable, which is incorrect.

So let's say you follow the warnings and decide not to break after dashes. Then, naturally, you may find yourself with badly under- or over-full boxes. The usual advice, which I normally follow, is to use FlushLeft, but even then you may have links that are particularly nasty to break into lines. So my scenario is:

  • Even FlushLeft thinks the line is underfull
  • There aren't any good characters to break the URL, after the last /
  • In the online case (i.e. viewing on a computer, not in printout), I need to support hyperref linking the URL properly — so weird characters shouldn't show up in the text or hyperref could get confused.
  • I'm using pdflatex, if that impacts the solution at all.
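
For concreteness, the ragged-right setup mentioned above might look like the minimal sketch below. It assumes that FlushLeft refers to the environment provided by the ragged2e package (the package is not named above), and the surrounding sentence is placeholder text.

```latex
\documentclass{article}
\usepackage{ragged2e}% assumption: source of the FlushLeft environment
\usepackage{hyperref}

\begin{document}
\begin{FlushLeft}
Placeholder text leading up to a long link,
\url{http://foo.com/is-this-read-able-or-readable},
followed by more placeholder text.
\end{FlushLeft}
\end{document}
```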

My question (which I hope isn't duplicated by the many related questions on this site!) is, is there a way to configure \url so that I can break at any character, insert a carriage-return symbol (something visually similar to \hookleftarrow, I think) as the hyphen, and still get hyperref to link properly (i.e., not think the carriage-return is part of the text)? According to this question, it's tricky, and it doesn't look like it'll play nicely with hyperref. And other suggestions on other questions mostly boil down to "use ragged-right text". Obviously, I'd rather break at a breakable point in the URL rather than resort to inserting symbols, but this URL is particularly uncooperative…

Edited: I revised the description of the problem to (hopefully) clarify what I'm aiming for. I'm aware of solutions involving breaking at dashes; I'm trying to avoid any confusion about dashes versus hyphens in the first place 🙂

Best Answer

Do you need to use breakurl? If not, \usepackage[hyphens]{url} mostly works (the box around the hyphenated URL is not quite right).

\documentclass{article}
\usepackage[hyphens]{url}
\usepackage{hyperref}

\begin{document}
\url{https://tex.stackexchange.com/questions/28835/option-to-break-urls-with-carriage-return-symbol}

\bigskip% An alternative, if this is meant for online use:
\href{https://tex.stackexchange.com/questions/28835/option-to-break-urls-with-carriage-return-symbol}{Question about how to break URLs}
\end{document}

The [hyphens] option also seems to solve the problem with the bibliography (adapted from Url references in bibliography):

\documentclass{article}
\usepackage[hyphens]{url}
\usepackage{filecontents}

\begin{filecontents}{\jobname.bib}
@misc{A01,
  author = {Lerner, Ben},
  year = {2011},
  title = {Break URLs},
  url = {http://short.domain.com/very-long-dashed-section},
}
\end{filecontents}

\begin{document}
\nocite{*}
\bibliographystyle{alphaurl}
\bibliography{\jobname}
\end{document}


UPDATE: 2011-09-20:

Borrowing from Replace hyphenation character by a backwards arrow, this uses \hookleftarrow as the hyphen separator and seems to work. Here I have allowed a break after every character, but if that is not desired, you could insert \BreakableChar only after the characters where you want to allow a break:


\documentclass{article}
\usepackage{hyphenat}
\usepackage{url}
\usepackage{xstring}
\usepackage{forloop}
\usepackage{xcolor}
%\usepackage{filecontents}

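% Box holding the arrow that marks a line break inside the URL.
\newsavebox\MyArrowBox%
\sbox\MyArrowBox{$\hookleftarrow$}%
\makeatletter% \prw@zbreak is an internal macro of the hyphenat package
\newcommand*{\BreakableChar}{%
  \leavevmode%
  \prw@zbreak%
  \discretionary{\usebox\MyArrowBox}{}{}% the arrow is printed only if the line breaks here
  \prw@zbreak%
}%
\makeatother%

\newcounter{index}%
% Typeset #1 one character at a time, allowing a break after each character.
\newcommand{\AddBreakableChars}[1]{%
  \StrLen{#1 }[\stringLength]% the extra space adds 1, so the loop reaches the last character
  \forloop[1]{index}{1}{\value{index}<\stringLength}{%
    \StrChar{#1}{\value{index}}[\currentLetter]%
    {\currentLetter\BreakableChar}%
  }%
}%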

\newcommand*{\MyUrl}{https://tex.stackexchange.com/questions/28835/option-to-break-urls-with-carriage-return-symbol}%

\begin{document}
\parbox{4cm}{\textcolor{blue}{\AddBreakableChars{\MyUrl}}}

\bigskip
\parbox{7cm}{\textcolor{blue}{\AddBreakableChars{\MyUrl}}}
\end{document}
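
The example above only typesets the URL; it does not create a link. One way to get a clickable link as well might be to pass the arrow-broken text as the link text of \href and keep the unmodified URL as the target, so that the arrows are never part of the address. The following is an untested sketch along those lines and not part of the original answer: \ArrowUrl and the counter name urlcharindex are hypothetical names introduced here, and the behavior of links that span a line break is assumed to be that of pdfLaTeX.

```latex
\documentclass{article}
\usepackage{hyphenat}% provides the internal \prw@zbreak used below
\usepackage{xstring}
\usepackage{forloop}
\usepackage{hyperref}

\newsavebox\MyArrowBox
\makeatletter
\newcommand*{\BreakableChar}{%
  \leavevmode
  \prw@zbreak
  \discretionary{\usebox\MyArrowBox}{}{}% arrow printed only at an actual break
  \prw@zbreak
}
\makeatother

% Counter renamed (hypothetical name) so it does not clash with \theindex.
\newcounter{urlcharindex}
\newcommand{\AddBreakableChars}[1]{%
  \StrLen{#1 }[\stringLength]% length + 1, so the loop covers every character
  \forloop[1]{urlcharindex}{1}{\value{urlcharindex}<\stringLength}{%
    \StrChar{#1}{\value{urlcharindex}}[\currentLetter]%
    {\currentLetter\BreakableChar}%
  }%
}

\newcommand*{\MyUrl}{https://tex.stackexchange.com/questions/28835/option-to-break-urls-with-carriage-return-symbol}

% Hypothetical helper: the link target is the plain URL, the link text is
% the same URL with an arrow-marked break allowed after every character.
\newcommand*{\ArrowUrl}[1]{\href{#1}{\AddBreakableChars{#1}}}

\begin{document}
\sbox\MyArrowBox{$\hookleftarrow$}% fill the arrow box once fonts are set up
\parbox{4cm}{\ArrowUrl{\MyUrl}}
\end{document}
```

Like the original, this sketch assumes the URL contains no characters that are special to TeX (such as #, % or _); those would need extra handling before the string is split into characters.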