[Tex/LaTex] Breaking links and escaping characters in bibliographies with backlinks and ocgcolorlinks set

back-referencing, hyperref, line-breaking, ocg, urls

I have the following situation: I have to use the ocgcolorlinks option in hyperref, which prevents links from breaking across lines. So I had the idea to set each character as its own link with \href. After putting some code together, while understanding only half of what I was doing, I arrived at a solution that worked: it handles both the linking and the escaping of special characters in the URLs via catcode changes.

Unfortunately, I mainly intended to use it in the bibliography, and when backref and pagebackref are used in hyperref, some macros are apparently redefined so that the escaping no longer works there. I was not able to fix this.

In the following you find a minimal example using my ugly hack:

\documentclass{report}
\usepackage[T1]{fontenc}
\usepackage[ocgcolorlinks,breaklinks=true,backref=page,pagebackref=true]{hyperref}
\usepackage{url}

%adapted from http://tex-and-stuff.blogspot.com/2011/03/counting-number-of-characters-in-tex_10.html
\newcommand{\targetlink}{}
\def\gobblechar{\let\char= }
\newcount\charcount
\def\countunlessnil{%
  \ifx\char\nil \let\next=\relax%
  \else%
    \let\next=\auxcountchar%
    \advance\charcount by 1%
  \fi\next
}%
\def\assignthencheck{\afterassignment\checknil\gobblechar}
\def\checknil{%
  \ifx\char\nil%
     \let\next=\relax%
  \else%
     \oldhref{\targetlink}{\char}\allowbreak\let\next=\assignthencheck% %%this is the place where each char is processed
  \fi%
  \next%
}
\let\oldhref\href%
%inspired of http://tex.stackexchange.com/questions/20890/define-an-escape-underscore-environment and http://www.tex.ac.uk/cgi-bin/texfaq2html?label=actinarg
\def\href#1{
    \renewcommand{\targetlink}{#1}%
\begingroup
\catcode`\_=12
\catcode`\&=12
\catcode`\~=12
\xhref
}
\def\xhref#1{
    \assignthencheck #1\nil%
    \endgroup
}%
\def\url{%
\begingroup
\catcode`\_=12
\catcode`\&=12
\catcode`\~=12
\xurl
}%
\def\xurl#1{
    \renewcommand{\targetlink}{#1}%
    \assignthencheck #1\nil%
    \endgroup
}%

\begin{document}
\url{http://a_b/c&d/~e}
\href{http://a_b/c&d/~e}{http://a_b/c&d/~e}
$\begin{array}{cc}s&ep\end{array}up_down$no~tilde

\bibliographystyle{diss1}
\begin{thebibliography}{1}
\bibitem[t]{test}
\newblock
\url{http://a_b/c&d/~e} %error here
\href{http://a_b/c&d/~e}{http://a_b/c&d/~e} %error here
$\begin{array}{cc}s&ep\end{array}up_down$no~tilde
\end{thebibliography}
\end{document}

The output should then be the following, twice:

 http://a_b/c&d/~e http://a_b/c&d/~e s epdownno tilde

Any idea what is happening and how I can fix it? Note that escaping the characters in the source is not really an option, since I get it from a database and escaping would also make the link targets incorrect.

Any other suggestions for cleaning the code or inserting \allowbreak only at non-letters/spaces are also welcome.

Best Answer

Edited 7/17/12 to add a workaround for XeLaTeX's lack of \pdfliteral

Edited 8/1/12 to consistently handle XeLaTeX's different location of the origin

I've been working on an alternate approach to ocgcolorlinks that doesn't prevent line breaks in the typeset text. It also doesn't need any trickery to redefine macros, change catcodes, or typeset the text twice. (NOTE: I think I've emailed the correct maintainers of hyperref to see if this solution is useful to them, but haven't heard back; in the meantime I'm posting it in case it helps other people.) Since I think your question is asking, "Is there a fairly robust way to enable line breaks with ocgcolorlinks while preserving odd characters and macros?", I'll answer that rather than try to debug the catcode and macro redefinitions. The approach should work for anything that \url could already handle.

The code below relies on a quirk of how Optional Content Groups (OCGs) work in PDFs: color-changing operations take effect regardless of whether the OCG is enabled, but graphics operations such as filling a path are conditional. PDFs also have a text rendering mode in which text is added to the clipping path rather than painted as visible text. The following code redefines two hyperref internal macros (\Hy@colorlink and \Hy@endcolorlink) and produces line-breakable, optionally colored hyperlinks:

\documentclass{article}
\usepackage[hyphens]{url}
\usepackage[ocgcolorlinks]{hyperref}
\usepackage{geometry}
\geometry{paperheight=2in,paperwidth=2in,margin=0.25in}

\makeatletter
\AtBeginDocument{%
    \newlength{\temp@x}%
    \newlength{\temp@y}%
    \newlength{\temp@w}%
    \newlength{\temp@h}%
    \def\my@coords#1#2#3#4{%
      \setlength{\temp@x}{#1}%
      \setlength{\temp@y}{#2}%
      \setlength{\temp@w}{#3}%
      \setlength{\temp@h}{#4}%
      \adjustlengths{}%
      \my@pdfliteral{\strip@pt\temp@x\space\strip@pt\temp@y\space\strip@pt\temp@w\space\strip@pt\temp@h\space re}}%
    \ifpdf
      \typeout{In PDF mode}%
      \def\my@pdfliteral#1{\pdfliteral page{#1}}% I don't know why this command...
      \def\adjustlengths{}%
    \fi
    \ifxetex
      \def\my@pdfliteral #1{\special{pdf: literal direct #1}}% ...isn't equivalent to this one
      \def\adjustlengths{\setlength{\temp@h}{-\temp@h}\addtolength{\temp@y}{1in}\addtolength{\temp@x}{-1in}}%
    \fi%
    \def\Hy@colorlink#1{%
      \begingroup
        \ifHy@ocgcolorlinks
          \def\Hy@ocgcolor{#1}%
          \my@pdfliteral{q}%
          \my@pdfliteral{7 Tr}% Set text mode to clipping-only
        \else
          \HyColor@UseColor#1%
        \fi
    }%
    \def\Hy@endcolorlink{%
      \ifHy@ocgcolorlinks%
        \my@pdfliteral{/OC/OCPrint BDC}%
        \my@coords{0pt}{0pt}{\pdfpagewidth}{\pdfpageheight}%
        \my@pdfliteral{F}% Fill clipping path (the url's text) with
                           % current color
        %
        \my@pdfliteral{EMC/OC/OCView BDC}%
        \begingroup%
          \expandafter\HyColor@UseColor\Hy@ocgcolor%
          \my@coords{0pt}{0pt}{\pdfpagewidth}{\pdfpageheight}%
          \my@pdfliteral{F}% Fill clipping path (the url's text)
                             % with \Hy@ocgcolor
        \endgroup%
        \my@pdfliteral{EMC}%
        \my@pdfliteral{0 Tr}% Reset text to normal mode
        \my@pdfliteral{Q}%
      \fi
      \endgroup
    }%
}
\makeatother

\begin{document}
Text before a long url that can break at hyphens, followed by the url itself
\url{http://long-url-that-can-break.com}, then a little bit more
text after that lengthy url\dots
\end{document} 

It renders text into the clipping path, then fills the clipping path with the current (normal text) color. Then, inside the OCView content group, it changes the color to \Hy@ocgcolor and fills the clipping path a second time. The net result is colored text on screen, and normal text in the printout. The text is even still selectable as text, and can be copied & pasted out of the document :)
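
To see the clipping trick in isolation, here is a minimal sketch with the OCG markers stripped out. It assumes pdfLaTeX (it uses \pdfliteral directly, so it will not work under XeLaTeX), and the red fill color and the 1000 by 1000 rectangle are arbitrary values chosen for illustration, not anything hyperref uses:

```latex
% Minimal sketch, pdfLaTeX only: paint text by filling a clipping path.
\documentclass{article}
\begin{document}
Normal text,
\pdfliteral page{q}%                save the graphics state
\pdfliteral page{7 Tr}%             rendering mode 7: glyphs go into the clip path
clipped text%
\pdfliteral page{1 0 0 rg}%         fill color red (arbitrary, for illustration)
\pdfliteral page{0 0 1000 1000 re}% rectangle assumed to cover the whole page
\pdfliteral page{F}%                fill it; only the glyph shapes get painted
\pdfliteral page{0 Tr}%             back to normal text rendering
\pdfliteral page{Q}%                restore the saved state (clip path, color)
, and normal text again.
\end{document}
```

The code in the answer does the same thing, except that it fills twice: once inside /OC/OCPrint with the current color and once inside /OC/OCView with \Hy@ocgcolor.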

Two caveats:

  1. If you create a hyperlink with no text in it, then the entire page will be filled with \Hy@ocgcolor, because the default clipping path is the entire page until you've added at least one character's worth of paths into it... I tried using the W n PDF operators to create a new clip path, but that failed; if anyone knows why, improvements are welcome.

  2. Do not let the url straddle a page break, or else the second page will again be filled with \Hy@ocgcolor, because the second half of the command (that resets the clipping mode and fills the clipping path) doesn't run until the second page. But since the fill command occurs on the second page, and none of the clipping path does, that whole page fills... I tried using the various AtBeginPage-like hooks to clean up and resume the clip path, but they didn't run at the correct times. Again, any expert advice is welcome.
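
As a partial guard against the first caveat, one could route links through a small wrapper that refuses to create a link when the text is empty. This is only a sketch: \safehref is a hypothetical helper name, not part of hyperref, and because the URL passes through a macro argument it is assumed not to contain # or % characters.

```latex
% Hypothetical helper: skip the link entirely if the link text is empty,
% so that the fill in \Hy@endcolorlink never runs with an empty clip path.
\newcommand{\safehref}[2]{%
  \if\relax\detokenize{#2}\relax
    % #2 is completely empty: typeset nothing
  \else
    \href{#1}{#2}%
  \fi
}

% Usage:
%   \safehref{http://example.com}{some text}  -> normal link
%   \safehref{http://example.com}{}           -> nothing is typeset
```

Note that this only catches a truly empty argument; link text consisting solely of spaces or of macros that produce no glyphs would still trigger the full-page fill.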
