[Tex/LaTex] Problem after copying text: inputenc Error: Unicode char \u8:‭ not set up for use with LaTeX

errors, input-encodings, unicode

I have a LaTeX document that compiles without any problems, but after copying some text into it, I get a large number of errors that say:

! Package inputenc Error: Unicode char \u8:­
not set up for use with LaTeX.

I know that this kind of error is quite common, but in my case it is caused not by a single character but by the whole text (which is very long).

The text was written in AbiWord and saved as a UTF-8 encoded text file. The Texmaker editor also uses UTF-8 encoding. I don't know what could be wrong with the copied text.

Best Answer

Unfortunately, utf8.def does not show the numerical code point of the missing Unicode character. The missing character <char> appears only literally, as part of the macro name \u8:<char>, which does not help when the character is invisible. The following example adds the code point to the error message:

\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{stringenc}
\usepackage{pdfescape}

\makeatletter
\renewcommand*{\UTFviii@defined}[1]{%
  \ifx#1\relax
    \begingroup
      % Remove prefix "\u8:"
      \def\x##1:{}%
      % Extract Unicode char from command name
      % (utf8.def does not support surrogates)
      \edef\x{\expandafter\x\string#1}%
      \StringEncodingConvert\x\x{utf8}{utf16be}% convert to UTF-16BE
      % Hexadecimal representation
      \EdefEscapeHex\x\x
      % Enhanced error message
      \PackageError{inputenc}{Unicode\space char\space \string#1\space
                              (U+\x)\MessageBreak
                              not\space set\space up\space
                              for\space use\space with\space LaTeX}\@eha
    \endgroup
  \else\expandafter
    #1%
  \fi
}
\makeatother

\begin{document}
^^c2^^a0 % 7-bit input for U+00A0
\end{document}

Result:

! Package inputenc Error: Unicode char \u8:  (U+00A0)
(inputenc)                not set up for use with LaTeX.
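
Once the code point is known, one possible follow-up is to tell inputenc how to typeset that character. The following is only a minimal sketch, assuming the offending character is U+00A0 (as in the example above) and that an ordinary non-breaking space (~) is an acceptable replacement. Other characters would need their own declaration with a suitable replacement:

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}

% Assumption: map U+00A0 (no-break space) to LaTeX's tie "~".
% The first argument is the code point in hexadecimal.
\DeclareUnicodeCharacter{00A0}{~}

\begin{document}
A^^c2^^a0B % 7-bit input for U+00A0 between "A" and "B"
\end{document}
```

With such a declaration in the preamble, the character no longer triggers the inputenc error. If the copied text contains many different unsupported characters, it may be easier to clean the text in the editor than to declare each one.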