[TeX/LaTeX] Country flags unicode char


I am trying to define new unicode chars for country flags. Unfortunately, flags are encoded using two regional indicator symbols according to the ISO 3166-1 alpha-2 two-letter country codes. So 🇩🇪 for example is 🇩 🇪. (To make it visible here I just added a space between the two characters.)
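
As a quick illustration of that encoding: the regional indicator symbols run from U+1F1E6 (A) to U+1F1FF (Z). These values are taken from the Unicode code charts, not from anything specific to TeX. Each letter of the country code is therefore shifted by a fixed offset. A small worked example, assuming amsmath is loaded for align* and \text:

```latex
\begin{align*}
\text{indicator}(X) &= \mathrm{1F1E6}_{16} + (X - \mathrm{A}) \\
\mathrm{D} &\mapsto \mathrm{1F1E6}_{16} + 3 = \mathrm{1F1E9}_{16} \\
\mathrm{E} &\mapsto \mathrm{1F1E6}_{16} + 4 = \mathrm{1F1EA}_{16}
\end{align*}
```

So the country code DE corresponds to the pair U+1F1E9 U+1F1EA.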

The problem is that both \DeclareUnicodeCharacter (pdfTeX) and \newunicodechar (XeTeX and LuaTeX) accept only a single character. That's why

\documentclass{article}
\usepackage{newunicodechar}
\newunicodechar{🇩🇪}{\rule{1.3em}{1em}}
\begin{document}
🇩🇪
\end{document}

for example does not work. Can I trick TeX into thinking a combination of two chars is one char? Or are there any other ideas for a workaround?

Best Answer

You can emulate what utf8 basically does: make the first symbol a macro that takes the second symbol as its argument.

\documentclass{article}
\usepackage{newunicodechar}
\usepackage{xparse}

\ExplSyntaxOn
% Each first regional indicator becomes a macro that grabs the second one.
\newunicodechar{🇩}{\flags_D:n}
\newunicodechar{🇺}{\flags_U:n}

% Second indicator after D: print the country name, or BAD if unknown.
\cs_new_protected:Nn \flags_D:n
 {
  \str_case:nnF { #1 }
   {
    {🇪}{Germany}
    {🇰}{Denmark}
   }
   {BAD}
 }
% Second indicator after U.
\cs_new_protected:Nn \flags_U:n
 {
  \str_case:nnF { #1 }
   {
    {🇰}{United~Kingdom}
    {🇸}{United~States}
   }
   {BAD}
 }
\ExplSyntaxOff

\begin{document}

Here is 🇩🇪

Here is 🇩🇰

Here is 🇺🇰

Here is 🇺🇸

\end{document}
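
Supporting another first letter should only require repeating the same pattern. The following is a minimal sketch for XeLaTeX or LuaLaTeX. The macro name \flags_F:n and the two countries (France, Finland) are illustrative additions that do not appear in the code above.

```latex
% Compile with XeLaTeX or LuaLaTeX.
\documentclass{article}
\usepackage{newunicodechar}
\usepackage{xparse}

\ExplSyntaxOn
% Hypothetical addition: the first indicator F grabs the second indicator.
\newunicodechar{🇫}{\flags_F:n}

\cs_new_protected:Nn \flags_F:n
 {
  \str_case:nnF { #1 }
   {
    {🇷}{France}
    {🇮}{Finland}
   }
   {BAD} % printed when the second indicator is not listed
 }
\ExplSyntaxOff

\begin{document}

Here is 🇫🇷

Here is 🇫🇮

\end{document}
```

Each first letter needs its own \newunicodechar declaration and its own lookup macro. The second letters are never declared; they are only read as arguments.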


A version that also works with pdflatex (weird errors are to be expected if regional indicator symbols do not appear in pairs):

\documentclass{article}
\usepackage{ifxetex}
\ifxetex\else
  \usepackage[utf8]{inputenc}
\fi
\usepackage{newunicodechar}
\usepackage{xparse}

\ExplSyntaxOn
% The first indicator only records its letter; \flags_print:n handles the rest.
\newunicodechar{🇩}{ \flags_print:n {D} }
\newunicodechar{🇺}{ \flags_print:n {U} }

\bool_if:nTF { \sys_if_engine_luatex_p: || \sys_if_engine_xetex_p: }
 {
  % Unicode engines: the second indicator is a single token.
  \cs_new:Nn \flags_print:n
   {
    \flags_print_unicode:nn { #1 }
   }
 }
 {
  % pdflatex: the second indicator is four bytes, the first being ^^f0.
  \cs_new:Nn \flags_print:n
   {
    \peek_charcode:NTF ^^f0
     {
      \flags_print_eightbit:nnnnn { #1 }
     }
     {
      BAD
     }
   }
 }

\cs_new_protected:Nn \flags_print_unicode:nn
 {
  \use:c { flags_#1:n } { #2 }
 }

% Reassemble the four bytes into one argument for the lookup macro.
\cs_new_protected:Nn \flags_print_eightbit:nnnnn
 {
  \use:c { flags_#1:n } { #2#3#4#5 }
 }

\cs_new_protected:Nn \flags_D:n
 {
  \str_case:nnF { #1 }
   {
    {🇪}{Germany}
    {🇰}{Denmark}
   }
   {BAD}
 }
\cs_new_protected:Nn \flags_U:n
 {
  \str_case:nnF {#1}
   {
    {🇰}{United~Kingdom}
    {🇸}{United~States}
   }
   {BAD}
 }
\ExplSyntaxOff

\begin{document}

Here is 🇩🇪

Here is 🇩🇰

Here is 🇺🇰

Here is 🇺🇸

\end{document}

It would also be possible to add a check on the next character for Unicode engines, but it seems more crucial for pdflatex.
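
The pdflatex branch peeks for ^^f0 and then grabs four arguments because, with inputenc, each regional indicator arrives as four separate byte tokens. As a worked example (my own arithmetic, not part of the answer's code, and assuming amsmath for align* and \text), here is the four-byte UTF-8 encoding of U+1F1EA, the regional indicator E:

```latex
\begin{align*}
\mathrm{1F1EA}_{16} &= 000\;011111\;000111\;101010_{2} \\
\text{byte 1} &= 11110\,000_{2} = \mathrm{F0}_{16} \\
\text{byte 2} &= 10\,011111_{2} = \mathrm{9F}_{16} \\
\text{byte 3} &= 10\,000111_{2} = \mathrm{87}_{16} \\
\text{byte 4} &= 10\,101010_{2} = \mathrm{AA}_{16}
\end{align*}
```

All regional indicators (U+1F1E6 to U+1F1FF) share the first three bytes F0 9F 87 and differ only in the last byte, which ranges from A6 to BF. Peeking for ^^f0 therefore only confirms that some four-byte character follows, not that it is a regional indicator.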