\usepackage{newunicodechar}
\newunicodechar{ }{ }
In the first argument you put a NO-BREAK SPACE (U+00A0), in the second a normal space. A better definition would be
\newunicodechar{ }{~}
(again the space is NO-BREAK SPACE), so this unbreakable space will stretch or shrink with the other spaces in the line. Of course, use the first one if you want a normal space, ça va sans dire. :)
There are a couple of problems:
There is already an action defined for ¦, precisely \IeC{\textbrokenbar}, which is kind of expected; thus \newcommand will give you the error.
If you do
\expandafter\newcommand\csname u8:\detokenize{∙}\endcsname{\kern1pt}
you're not defining the macro \∙, but a meaning for the Unicode character ∙. Since ∙ is represented in UTF-8 by the triple E2 88 99, TeX will see \^^e2 and the error message uses some representation of the three bytes.
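To see the action that is already defined for ¦, one can write its current meaning to the log. This is only a minimal sketch, assuming pdfLaTeX with inputenc in UTF-8 mode; the exact text reported depends on the LaTeX release (older kernels wrap the definition in \IeC, newer ones may not).

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % UTF-8 handling for pdfLaTeX
\usepackage{textcomp}       % provides \textbrokenbar on older kernels
% Write the current meaning of the internal u8: macro for the broken bar
% to the terminal and the log file.
\typeout{\expandafter\meaning\csname u8:\detokenize{¦}\endcsname}
\begin{document}
A¦A
\end{document}
```

Under LuaLaTeX or XeLaTeX no such u8: macro exists, so the check is not meaningful there.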
With newunicodechar you don't have to do anything special:
% -*- coding: utf-8 -*-
\documentclass[11pt,english]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{textcomp}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{graphicx}
\usepackage{babel}
\usepackage{newunicodechar}
\newunicodechar{¦}{\kern20pt} % exaggerated to show the effect
\begin{document}
A¦A
\end{document}
The output shows the two letters A separated by a 20pt gap, and the log file will report
Package newunicodechar Warning: Redefining Unicode character on input line 11.
which would be
Package newunicodechar Warning: Redefining Unicode character; it meant
(newunicodechar) *** \IeC {\textbrokenbar } ***
(newunicodechar) before your redefinition on input line 11.
if the verbose option is used (\usepackage[verbose]{newunicodechar}).
Here's the relevant part from the documentation of newunicodechar.
The package provides only one command, \newunicodechar, which must be called with two arguments:
\newunicodechar{<char>}{<code>}
where <char> is the Unicode character to which we need to give a meaning and <code> is that meaning, that is the LaTeX code that will be substituted to the character.
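As an illustration of the two arguments, here is a small sketch. The choice of character (∙, U+2219) and of replacement code (\ensuremath{\cdot}) is only an assumption made for the example, not something prescribed by the package.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{newunicodechar}
% <char> is the character U+2219; <code> is what LaTeX substitutes for it.
% \ensuremath lets the replacement work in both text and math mode.
\newunicodechar{∙}{\ensuremath{\cdot}}
\begin{document}
a∙b and $a∙b$
\end{document}
```

If the character already has a meaning, the package issues the "Redefining Unicode character" warning shown earlier and then applies the new definition.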
With pdfLaTeX, and with the latex command on modern distributions, they are the same: both evaluate to \nobreakspace. In LuaLaTeX and XeLaTeX, they are different by default, but you can change that.
The inputenc package parses the no-break space character (in each encoding that has it) as \nobreakspace. This holds, for example, for the Latin-1 encoding and for the default, UTF-8.
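For reference, the declarations in question look roughly like this. They are written from memory of the inputenc support files (latin1.def and utf8enc.dfu), so treat them as a sketch and check your own installation for the exact wording.

```latex
% Latin-1 (latin1.def): input byte 160 is the no-break space
\DeclareInputText{160}{\nobreakspace}

% UTF-8 (utf8enc.dfu): the same mapping for code point U+00A0
\DeclareUnicodeCharacter{00A0}{\nobreakspace}
```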
The LaTeX kernel also makes ~ an active character that expands to \nobreakspace.
In LuaLaTeX or XeLaTeX, ~ still evaluates to \nobreakspace, which is defined in the LaTeX kernel. However, the character U+00A0 is interpreted literally. (It still searches and copies from the PDF as a space character, though.) You can clearly see the difference with a test file.
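A test file along the following lines should make the difference visible. It is only a sketch, not the file from the original answer, and it assumes that the default font provides a glyph for U+00A0. As far as I recall, the kernel defines \nobreakspace as \leavevmode\nobreak followed by a control space, which is why it stretches like any other interword space.

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\begin{document}
% Both boxes are stretched to 8cm, so stretchable spaces widen visibly.
% First box: ~ (that is, \nobreakspace) between "two" and "three".
\noindent\makebox[8cm][s]{one two~three four}\par
% Second box: a literal U+00A0 between "two" and "three".
\noindent\makebox[8cm][s]{one two three four}
\end{document}
```

In the first box all three gaps should stretch equally; in the second, the gap made by U+00A0 should keep the width set by the font.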
In particular, U+00A0 has a fixed width set by the font, whereas \nobreakspace uses the same interword spacing as the rest of the line, so you might want the fixed-width non-breaking space for a monospace font. The no-break space character, ^^a0, \symbol{"A0} and \char"A0 all give the same output.
However, you could redefine U+00A0 to evaluate to \nobreakspace.
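One way to do this is with the newunicodechar package. The following is a sketch under that assumption, and the original answer may have used a different mechanism.

```latex
% Compile with lualatex or xelatex.
\documentclass{article}
\usepackage{newunicodechar}
% The first argument is a literal NO-BREAK SPACE (U+00A0).
\newunicodechar{ }{\nobreakspace}
\begin{document}
% The U+00A0 between "two" and "three" now stretches like the other spaces.
\noindent\makebox[8cm][s]{one two three four}
\end{document}
```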