[TeX/LaTeX] Cyrillic in URLs with hyperref produces a file link instead of a URL link

cyrillic hyperref xetex

I have a URL containing Cyrillic characters, e.g.:
\href{https://bg.wikipedia.org/wiki/Начална_страница}{Bulgarian Wikipedia Main Page}

When I compile the document, the link points to a local file such as: file:\\\Users\me\path\to\source\file\location\followed_by_something_mangled
which is obviously not the desired result.

It appears that \href fails to recognize the target as a URL link and treats it as a file link instead.

One way to circumvent this is described here (in short: one encodes the original Cyrillic-containing URL to https://bg.wikipedia.org/wiki/%D0%9D%D0%B0%D1%87%D0%B0%D0%BB%D0%BD%D0%B0_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0)

This, however, aside from being ugly, contains many % signs, which are read as comment characters when \href is used inside the argument of another command, say \textit{}; compilation then fails because of mismatched braces.

Any ideas how to fix that?

P.S. I use XeLaTeX and my preamble is:

\documentclass[12pt,a4paper]{book}

\usepackage{fontspec}
\usepackage{xunicode}
\usepackage{xltxtra}
\setmainfont{Free Serif}

\usepackage{hyperref}

A similar problem appears when I use pdfLaTeX with inputenc.

Best Answer

The percent-encoded address is the correct form for the first argument of \href. If \href is used inside the argument of another command, each percent sign can be escaped as \% so that it is not interpreted as a comment character:

\documentclass{article}
\usepackage{hyperref}
\begin{document}
\textit{%
  \href{https://bg.wikipedia.org/wiki/%
    \%D0\%9D\%D0\%B0\%D1\%87\%D0\%B0\%D0\%BB\%D0\%BD\%D0\%B0_%
    \%D1\%81\%D1\%82\%D1\%80\%D0\%B0\%D0\%BD\%D0\%B8\%D1\%86\%D0\%B0}{%
    Bulgarian Wikipedia Main Page}%
}
\end{document}
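
For comparison, here is a minimal sketch of the top-level case. According to the hyperref manual, % and # need no escaping in the first argument of \href unless the command sits inside the argument of another command, so the unescaped address should work here, assuming it is kept on a single line:

```latex
\documentclass{article}
\usepackage{hyperref}
\begin{document}
% \href is at the top level here (not inside \textit or another argument),
% so hyperref reads the address itself and treats % as an ordinary character.
\href{https://bg.wikipedia.org/wiki/%D0%9D%D0%B0%D1%87%D0%B0%D0%BB%D0%BD%D0%B0_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0}{Bulgarian Wikipedia Main Page}
\end{document}
```

As soon as the same \href is moved into \textit{...}, the % signs are tokenized as comment characters before hyperref sees them, which is why the escaped form is needed there.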

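Another variation defines the URL in a macro while % has category code 12 (other) and ^^A serves as the comment character; the macro is then expanded before \href reads its argument: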

\documentclass{article}
\usepackage{hyperref}

\begingroup
  \catcode`\^^A = 14 % ^^A is comment char
  \catcode`\%=12
  \gdef\UrlBulgarianWikipediaMainPage{^^A
    https://bg.wikipedia.org/wiki/^^A
    %D0%9D%D0%B0%D1%87%D0%B0%D0%BB%D0%BD%D0%B0_^^A
    %D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0}
\endgroup

\begin{document}
\textit{%
  \expandafter\href\expandafter{%
    \UrlBulgarianWikipediaMainPage
  }{%
    Bulgarian Wikipedia Main Page}%
}
\end{document}
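
If several such addresses are needed, the same category code trick could be wrapped in a small helper. The following is only a sketch: \DeclareRawUrl and \DeclareRawUrlAux are hypothetical names, not part of hyperref, and the helper assumes it is called at the top level with the address on a single line.

```latex
\documentclass{article}
\usepackage{hyperref}

% Hypothetical helpers, not part of hyperref.
% \DeclareRawUrl\name{address} stores the address in \name, reading it
% while % is an ordinary character (category code 12).
\newcommand*{\DeclareRawUrl}[1]{%
  \begingroup
  \catcode`\%=12
  \DeclareRawUrlAux{#1}%
}
% Reads the address with the changed category code, closes the group,
% and then defines the macro outside the group.
\newcommand*{\DeclareRawUrlAux}[2]{%
  \endgroup
  \def#1{#2}%
}

% Must be used at the top level, with the address on a single line.
\DeclareRawUrl\UrlBulgarianWikipediaMainPage{https://bg.wikipedia.org/wiki/%D0%9D%D0%B0%D1%87%D0%B0%D0%BB%D0%BD%D0%B0_%D1%81%D1%82%D1%80%D0%B0%D0%BD%D0%B8%D1%86%D0%B0}

\begin{document}
\textit{%
  \expandafter\href\expandafter{\UrlBulgarianWikipediaMainPage}{%
    Bulgarian Wikipedia Main Page}%
}
\end{document}
```

The helper only changes the category code of %, matching the variation above; it cannot be used inside the argument of another command, because the address would then already be tokenized with % as a comment character.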