[TeX/LaTeX] Can’t get Unicode to work in \title (inputenc: Unicode char not set up for use with LaTeX)

titling, unicode

I want to use the bottom square bracket ⎵ (U+23B5) in a title:

\documentclass[11pt,a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage{titling}

\usepackage[lf]{venturis}
\usepackage[T1]{fontenc}

\title{Foo⎵Bar}  % this has the offending character
\author{Myself}
\date{}

\begin{document}
\maketitle
\end{document}

All I get is

! Package inputenc Error: Unicode char ⎵ (U+23B5)
(inputenc)                not set up for use with LaTeX.

Best Answer

Short answer: You can use \DeclareUnicodeCharacter{23B5}{\textvisiblespace}. Or, instead of ⎵ (U+23B5 BOTTOM SQUARE BRACKET), you can use ␣ (U+2423 OPEN BOX), which is already defined as \textvisiblespace.
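
As a minimal sketch of the second option: this assumes your editor can insert ␣ directly, and it leaves out the venturis font from the question so that it compiles on a stock installation.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

% No \DeclareUnicodeCharacter needed here: U+2423 OPEN BOX is
% already mapped to \textvisiblespace by the standard definitions.
\title{Foo␣Bar}
\author{Myself}
\date{}

\begin{document}
\maketitle
\end{document}
```

The trade-off is that the source file then contains a different code point than the one originally asked for, which matters only if you care about the exact character in the input.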


Long answer: When you type ⎵ in your input file and save it with UTF-8 encoding (which is probably the default in your editor), this character, U+23B5 BOTTOM SQUARE BRACKET, gets stored as the byte sequence E2 8E B5. When TeX reads the file, the \usepackage[utf8]{inputenc} in the preamble tells it to interpret those bytes as the character U+23B5. This is great, but quite separately TeX needs to know what to do with that character, and that is the part that is missing here.

You can in fact attach any instruction to any character. In your case the instruction is simply to pick a symbol and typeset it. If you look in The Comprehensive LaTeX Symbol List (probably available on your computer with texdoc symbols-a4), you will see that the symbol is available (in fact under “Frequently Requested Symbols”) as \textvisiblespace. So you can use that:

\documentclass{article}
\usepackage[utf8]{inputenc}

\usepackage[lf]{venturis}
\usepackage[T1]{fontenc}
\DeclareUnicodeCharacter{23B5}{\textvisiblespace}

\title{Foo⎵Bar}  % this is fine now
\author{Myself}
\date{}

\begin{document}
\maketitle
\end{document}

produces

screenshot from pdflatex

(Instead of \DeclareUnicodeCharacter{23B5}{\textvisiblespace} you can also use \newunicodechar{⎵}{{\textvisiblespace}} from the newunicodechar package.)
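
For completeness, here is a sketch of how the newunicodechar variant might look as a full document. It is the same example as above with the declaration swapped, again without the venturis font so that it does not depend on that package being installed.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{newunicodechar}

% Map the literal character (rather than its hex code point)
% to the visible-space symbol. The extra pair of braces keeps
% the replacement text grouped.
\newunicodechar{⎵}{{\textvisiblespace}}

\title{Foo⎵Bar}
\author{Myself}
\date{}

\begin{document}
\maketitle
\end{document}
```

The main convenience is that you type the character itself in the declaration, so there is no code point to look up.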

In fact, some common definitions come with a typical TeX distribution. On my computer, /usr/local/texlive/2017/texmf-dist/tex/latex/base/utf8.def contains definitions for ©, ª, ®, º, ˆ, ˜, ZWNJ, …, ™, ␣:

\DeclareUnicodeCharacter{00A9}{\textcopyright}
\DeclareUnicodeCharacter{00AA}{\textordfeminine}
\DeclareUnicodeCharacter{00AE}{\textregistered}
\DeclareUnicodeCharacter{00BA}{\textordmasculine}
\DeclareUnicodeCharacter{02C6}{\textasciicircum}
\DeclareUnicodeCharacter{02DC}{\textasciitilde}
\DeclareUnicodeCharacter{200C}{\textcompwordmark}
\DeclareUnicodeCharacter{2026}{\textellipsis}
\DeclareUnicodeCharacter{2122}{\texttrademark}
\DeclareUnicodeCharacter{2423}{\textvisiblespace}

but a definition of ⎵ as \textvisiblespace is not one of them.


Alternative: If you use a Unicode-aware TeX engine like XeTeX or LuaTeX (compile your file with xelatex or lualatex rather than pdflatex), then you can simply use the ⎵ character from a system font that contains it. For example:

\documentclass{article}
\usepackage{fontspec}
\setmainfont{FreeSerif} % Sets this font for the entire document
\begin{document}
Foo⎵Bar
\end{document}

or

\documentclass{article}
\usepackage{fontspec}
\newfontfamily{\symbolfont}{Noto Sans Symbols}
\usepackage{newunicodechar}
\newunicodechar{⎵}{{\symbolfont⎵}} % Use this font for this specific character
\begin{document}
Foo⎵Bar
\end{document}

Be aware that TeX has a rather serious (IMO) usability/design flaw here: if the character is missing from the font, TeX does not raise an error but only writes a missing character warning to the .log file. You can set \tracinglostchars=2 to have the message shown in the terminal as well, but you still need to watch out for it.
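
A minimal sketch of how this could be set up, to be compiled with xelatex or lualatex. It assumes the default font (Latin Modern) does not contain ⎵, which is what I would expect but have not checked for every version.

```latex
\documentclass{article}
\usepackage{fontspec}

% Report missing glyphs on the terminal as well as in the .log file.
% This is still only a diagnostic message; compilation continues
% and the character is silently dropped from the output.
\tracinglostchars=2

\begin{document}
Foo⎵Bar
\end{document}
```

As far as I know, recent distributions (TeX Live 2021 and later) also accept \tracinglostchars=3, which turns a missing character into an actual error. Treat that as an assumption to verify against your own installation.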