[Tex/LaTex] Country flags unicode char


I am trying to define new Unicode characters for country flags. Unfortunately, a flag is encoded as a pair of regional indicator symbols that spell the country's ISO 3166-1 alpha-2 two-letter code. For example, 🇩🇪 is 🇩 followed by 🇪. (I added a space between the two characters here to make them visible.)

The problem is that both \DeclareUnicodeCharacter (pdfTeX) and \newunicodechar (XeTeX and LuaTeX) accept only a single character. That's why
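
To make the encoding concrete: the regional indicator for a letter sits at a fixed offset from U+1F1E6, which is the indicator for A. The following is only a minimal sketch of that arithmetic. \RIcodepoint is a hypothetical helper name invented for this illustration, not part of any package, and the example assumes a LaTeX kernel recent enough to provide expl3 and \int_to_Hex:n.

\documentclass{article}
\usepackage{xparse}

\ExplSyntaxOn
% Hypothetical helper: print the code point of the regional indicator
% that corresponds to one uppercase ASCII letter.
\NewExpandableDocumentCommand \RIcodepoint { m }
 {
  U+ \int_to_Hex:n { "1F1E6 + `#1 - `A }
 }
\ExplSyntaxOff

\begin{document}
% The flag for DE is the pair of code points printed here
D is \RIcodepoint{D}, E is \RIcodepoint{E}.
\end{document}

This typesets U+1F1E9 and U+1F1EA, so TeX always sees two separate code points when reading a flag.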

\documentclass{article}
\usepackage{newunicodechar}
\newunicodechar{🇩🇪}{\rule{1.3em}{1em}}
\begin{document}
🇩🇪
\end{document}

for example does not work. Can I trick TeX into thinking a combination of two chars is one char? Or are there any other ideas for a workaround?

Best Answer

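You can emulate what the utf8 option of inputenc basically does under pdfTeX: make the first character a macro that reads the following character as its argument and decides what to print.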

\documentclass{article}
\usepackage{newunicodechar}
\usepackage{xparse}

\ExplSyntaxOn
\newunicodechar{🇩}{\flags_D:n}
\newunicodechar{🇺}{\flags_U:n}

\cs_new_protected:Nn \flags_D:n
 {
  \str_case:nnF { #1 }
   {
    {🇪}{Germany}
    {🇰}{Denmark}
   }
   {BAD}
 }
\cs_new_protected:Nn \flags_U:n
 {
  \str_case:nnF { #1 }
   {
    {🇰}{United~Kingdom}
    {🇸}{United~States}
   }
   {BAD}
 }
\ExplSyntaxOff

\begin{document}

Here is 🇩🇪

Here is 🇩🇰

Here is 🇺🇰

Here is 🇺🇸

\end{document}

A version that also works with pdflatex (expect weird errors if regional indicator symbols do not appear in pairs):

\documentclass{article}
\usepackage{ifxetex}
\ifxetex\else
  \usepackage[utf8]{inputenc}
\fi
\usepackage{newunicodechar}
\usepackage{xparse}

\ExplSyntaxOn
\newunicodechar{🇩}{ \flags_print:n {D} }
\newunicodechar{🇺}{ \flags_print:n {U} }

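% Unicode engines see each regional indicator as a single token;
% pdfTeX sees four bytes, the first of which is always 0xF0.
\bool_if:nTF { \sys_if_engine_luatex_p: || \sys_if_engine_xetex_p: }
 {
  \cs_new_protected:Nn \flags_print:n
   {
    \flags_print_unicode:nn { #1 }
   }
 }
 {
  % \peek_charcode:NTF is not expandable, hence the protected definition
  \cs_new_protected:Nn \flags_print:n
   {
    \peek_charcode:NTF ^^f0
     {
      \flags_print_eightbit:nnnnn { #1 }
     }
     {
      BAD
     }
   }
 }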

\cs_new_protected:Nn \flags_print_unicode:nn
 {
  \use:c { flags_#1:n } { #2 }
 }

\cs_new_protected:Nn \flags_print_eightbit:nnnnn
 {
  \use:c { flags_#1:n } { #2#3#4#5 }
 }

\cs_new_protected:Nn \flags_D:n
 {
  \str_case:nnF { #1 }
   {
    {🇪}{Germany}
    {🇰}{Denmark}
   }
   {BAD}
 }
\cs_new_protected:Nn \flags_U:n
 {
  \str_case:nnF {#1}
   {
    {🇰}{United~Kingdom}
    {🇸}{United~States}
   }
   {BAD}
 }
\ExplSyntaxOff

\begin{document}

Here is 🇩🇪

Here is 🇩🇰

Here is 🇺🇰

Here is 🇺🇸

\end{document}

A check on the next character could also be added for Unicode engines, but it seems more crucial for pdflatex, where the macro otherwise grabs the next four tokens regardless of what they are.
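
One way such a check could look on XeLaTeX or LuaLaTeX is sketched below. This is an assumption-laden illustration rather than part of the answer above: \flags_checked:nn is a hypothetical name, and the test assumes that whatever follows the first indicator is a single character token (a control sequence such as \par would raise an error in the character-code test).

\documentclass{article}
\usepackage{newunicodechar}
\usepackage{xparse}

\ExplSyntaxOn
\newunicodechar{🇩}{ \flags_checked:nn {D} }

% Hypothetical wrapper: #1 = first letter, #2 = the token that follows
\cs_new_protected:Nn \flags_checked:nn
 {
  % Regional indicators occupy U+1F1E6 to U+1F1FF
  \int_compare:nTF { "1F1E6 <= `#2 <= "1F1FF }
   { \use:c { flags_#1:n } { #2 } }
   { BAD #2 } % put the grabbed token back after the marker
 }

\cs_new_protected:Nn \flags_D:n
 {
  \str_case:nnF { #1 }
   {
    {🇪}{Germany}
    {🇰}{Denmark}
   }
   {BAD}
 }
\ExplSyntaxOff

\begin{document}

Here is 🇩🇪

Here is 🇩x

\end{document}

With this sketch a lone 🇩 no longer swallows the following character: the token is tested first and reinserted after BAD when it is not a regional indicator.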

Related Solutions

[Tex/LaTex] Package inputenc Error: Unicode char ẁ (U+1E81)

The ẁ character is not supported in Latin-1; save your file as UTF-8 and add

\usepackage[utf8]{inputenc}

and also

\DeclareUnicodeCharacter{1E81}{\`w}

Full example:

\documentclass{article}
\usepackage[utf8]{inputenc}

\DeclareUnicodeCharacter{1E81}{\`w}

\begin{document}

Is the character ẁ used in some language?

\end{document}

[Tex/LaTex] Insert Unicode private char

If the engine is Unicode-aware and the current font contains a glyph for the private-use code point, you can enter the character with TeX's caret notation:

^^^^e25f

See: The ^^ notation in various engines. This is TeX's method to encode non-ASCII characters with ASCII and can also be used inside command tokens.

There are also commands to select a character by slot in the current font:

LaTeX command:

\symbol{"E25F}

or the (plain) TeX variant:

\char"E25F\relax

BTW, there is a non-private code point for this glyph:

U+27A2 Three-D Top-Lighted Rightwards Arrowhead

\documentclass{article}
\usepackage{fontspec}
\begin{document}
\newcommand*{\test}[1]{%
  \begingroup\fontspec{#1}\symbol{"27A2}\endgroup
}
\test{DejaVu Sans}
\test{Segoe UI Symbol}
\end{document}
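
For comparison, the three input methods mentioned above can be put side by side. This is a minimal sketch for XeLaTeX or LuaLaTeX. It assumes DejaVu Sans is installed and, as in the test above, contains a glyph for U+27A2. Note that the caret notation ^^^^27a2 requires lowercase hex digits.

\documentclass{article}
\usepackage{fontspec}
\begin{document}
\fontspec{DejaVu Sans}% assumed to be installed
^^^^27a2 % caret notation: four carets, four lowercase hex digits
\symbol{"27A2} % LaTeX command
\char"27A2\relax % plain TeX primitive
\end{document}

All three lines select the same slot, so the arrowhead should appear three times.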
