[Tex/LaTex] How to input accents in PlainTeX with UTF-8 encoding

accents, plain-tex, unicode

I'm using TeX with the plain format. My keyboard has the 'è' character (as well as é, ò, à, ù, ì). I'd like to set things up so that, when I put è in the input file, TeX transforms it into \`e, i.e. the letter e with a grave accent. I think I have to change the character code of è and do something along those lines, but I don't know exactly what to do. Can you help me?

Best Answer

If you want to use Knuth TeX you'll have a hard time. With pdftex it's easier, because there are some useful features coming from e-TeX extensions.
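
To see the principle in isolation: on an 8-bit engine è arrives as the two bytes "C3 "A8, so the first byte can be made active and made to choose a macro according to the second byte. The following is only a minimal sketch of that idea, limited to the six letters in the question. The helper \utfdef and the utf-c3 macro names are made up for this illustration, and the file is assumed to be saved as UTF-8.

```tex
% Minimal sketch for an 8-bit engine with the plain format.
% \utfdef and the utf-c3... names are illustrative only.
\catcode`\^^c3=\active   % "C3 is the first byte of U+00C0..U+00FF
% The active byte grabs the second byte and builds a macro name from it
\def^^c3#1{\csname utf-c3\string#1\endcsname}
\def\utfdef#1#2{\expandafter\def\csname utf-c3\string#1\endcsname{#2}}
\utfdef{^^a0}{\`a}   % à is "C3 "A0
\utfdef{^^a8}{\`e}   % è is "C3 "A8
\utfdef{^^a9}{\'e}   % é is "C3 "A9
\utfdef{^^ac}{\`\i}  % ì is "C3 "AC
\utfdef{^^b2}{\`o}   % ò is "C3 "B2
\utfdef{^^b9}{\`u}   % ù is "C3 "B9

La città è bella perché c'è il sole; però più tardi piove, così dicono.
\bye
```

This does nothing for hyphenation of accented words or for the other UTF-8 prefixes, which is what a larger setup has to deal with.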

Here's a seemingly working setup (I include only a reduced version of the first file, because of the character limit here).

utfplainmac.tex

% -*- coding: utf-8 -*-
% We set a safe catcode for ^ and ^^^; XeTeX uses the ^^^^ convention for
% specifying arbitrary 16 bit code points. So if XeTeX is used, \gobble
% eats up ^^^^0021, while with an 8 bit engine only ^^^ is
% swallowed and \next is not \empty. In the end, \ifunicodeengine is
% \iftrue if the engine is Unicode aware, \iffalse if the engine is 8 bit.

\catcode`\^=7
\catcode`\~=\active 

\newif\ifunicodeengine
\begingroup
\catcode30=12 % just in case: 30 is `\^^^
\def\gobble#1{}
\edef\next{\gobble^^^^0021}
\expandafter\endgroup
\ifx\next\empty\unicodeenginetrue\else\unicodeenginefalse\fi

\message{Engine is \ifunicodeengine Unicode aware\else 8 bit\fi, loading UTF-8 combinations}

\ifunicodeengine
%%% Make the first argument active and define it as the fourth
%%% The trick avoids a global definition: the \lowercase changes
%%% ~ into #1 as active character; then \endgroup\def#1 is put
%%% back into the token stream (here #1 stands for the actual
%%% character given as argument); the same trick is used for
%%% \UseUnicodeCharacter, which must have an argument expressed
%%% as a four digit hexadecimal number (with uppercase A..F).
  \def\DoUTFCombination#1#2#3#4{\catcode"#1\active
    \begingroup\lccode`~="#1\lowercase{\endgroup\def~}{#4}}
  \def\UseUnicodeCharacter#1{\begingroup\lccode`~="#1\lowercase{\endgroup~}}
\else
%%% The UTF-8 prefixes are made active; they just look at the
%%% following token, which is a category 12 character unless something
%%% strange has happened, and forms with it a control sequence that
%%% will be defined later
  \catcode`\^^c2=\active
  \def^^c2#1{\csname UTFprefix-c2#1\endcsname}
  \catcode`\^^c3=\active
  \def^^c3#1{\csname UTFprefix-c3#1\endcsname}
  \catcode`\^^c4=\active
  \def^^c4#1{\csname UTFprefix-c4#1\endcsname}
  \catcode`\^^c5=\active
  \def^^c5#1{\csname UTFprefix-c5#1\endcsname}
  \catcode`\^^c6=\active
  \def^^c6#1{\csname UTFprefix-c6#1\endcsname}
  \catcode`\^^c7=\active
  \def^^c7#1{\csname UTFprefix-c7#1\endcsname}
  \catcode`\^^c8=\active
  \def^^c8#1{\csname UTFprefix-c8#1\endcsname}
  \catcode`\^^cb=\active
  \def^^cb#1{\csname UTFprefix-cb#1\endcsname}

%%% If the file is input by a UTF-8 unaware engine, we define the main
%%% command that associates the UTF-8 character (actually a two byte
%%% combination) to a list of tokens; we define also
%%% \UseUnicodeCharacter to access the same replacement text via an
%%% auxiliary macro \UTFCodePoint-xxxx, where xxxx stands for the
%%% argument to \UseUnicodeCharacter, a four digit hexadecimal number
%%% (uppercase A..F).
  \def\DoUTFCombination#1#2#3#4{%
    \expandafter\def\csname UTFprefix-#2#3\endcsname{#4}%
    \expandafter\def\csname UTFCodePoint-#1\endcsname{#4}%
  }
  \def\UseUnicodeCharacter#1{\csname UTFCodePoint-#1\endcsname}
\fi

%%% Some (actually many) UTF-8 characters cannot be printed with T1
%%% or TS1 encoded fonts
\newif\ifUTFwarning \UTFwarningtrue
\def\BadUTF#1{%
  \ifUTFwarning
    \global\UTFwarningfalse
    \errhelp{Look in the log file for unsupported characters}%
    \errmessage{Unsupported UTF character}%
  \fi
  \wlog{Character #1 not currently supported on line
    \the\inputlineno}%
}

%%% A shorthand for choosing the text companion font
\def\tcsym#1{{\tcfont\char#1}}

%%% The list of characters: Unicode code point, prefix and second
%%% byte, then the definition.
\DoUTFCombination{00A0}{c2}{^^a0}{~} % NO-BREAK SPACE
\DoUTFCombination{00A1}{c2}{^^a1}{!`} % INVERTED EXCLAMATION MARK
\DoUTFCombination{00A2}{c2}{^^a2}{\tcsym{"A2}} % CENT SIGN
\DoUTFCombination{00A3}{c2}{^^a3}{\pound} % POUND SIGN
\DoUTFCombination{00A4}{c2}{^^a4}{\tcsym{"A4}} % CURRENCY SIGN
\DoUTFCombination{00A5}{c2}{^^a5}{\tcsym{"A5}} % YEN SIGN
\DoUTFCombination{00A6}{c2}{^^a6}{\tcsym{"A6}} % BROKEN BAR
\DoUTFCombination{00A7}{c2}{^^a7}{\tcsym{"A7}} % SECTION SIGN
\DoUTFCombination{00A8}{c2}{^^a8}{\"{}} % DIAERESIS
\DoUTFCombination{00A9}{c2}{^^a9}{\tcsym{"A9}} % COPYRIGHT SIGN
\DoUTFCombination{00AA}{c2}{^^aa}{\tcsym{"AA}} % FEMININE ORDINAL INDICATOR
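\DoUTFCombination{00AB}{c2}{^^ab}{<<} % LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
%%%% Entries 00AC to 00BA omitted
\DoUTFCombination{00BB}{c2}{^^bb}{>>} % RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK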
\DoUTFCombination{00BC}{c2}{^^bc}{\tcsym{"BC}} % VULGAR FRACTION ONE QUARTER
\DoUTFCombination{00BD}{c2}{^^bd}{\tcsym{"BD}} % VULGAR FRACTION ONE HALF
\DoUTFCombination{00BE}{c2}{^^be}{\tcsym{"BE}} % VULGAR FRACTION THREE QUARTERS
\DoUTFCombination{00BF}{c2}{^^bf}{?`} % INVERTED QUESTION MARK

\DoUTFCombination{00C0}{c3}{^^80}{\`A} % LATIN CAPITAL LETTER A WITH GRAVE
\DoUTFCombination{00C1}{c3}{^^81}{\'A} % LATIN CAPITAL LETTER A WITH ACUTE
\DoUTFCombination{00C2}{c3}{^^82}{\^A} % LATIN CAPITAL LETTER A WITH CIRCUMFLEX
\DoUTFCombination{00C3}{c3}{^^83}{\~A} % LATIN CAPITAL LETTER A WITH TILDE
\DoUTFCombination{00C4}{c3}{^^84}{\"A} % LATIN CAPITAL LETTER A WITH DIAERESIS
\DoUTFCombination{00C5}{c3}{^^85}{\AA} % LATIN CAPITAL LETTER A WITH RING ABOVE
\DoUTFCombination{00C6}{c3}{^^86}{\AE} % LATIN CAPITAL LETTER AE
\DoUTFCombination{00C7}{c3}{^^87}{\c{C}} % LATIN CAPITAL LETTER C WITH CEDILLA
\DoUTFCombination{00C8}{c3}{^^88}{\`E} % LATIN CAPITAL LETTER E WITH GRAVE
\DoUTFCombination{00C9}{c3}{^^89}{\'E} % LATIN CAPITAL LETTER E WITH ACUTE
\DoUTFCombination{00CA}{c3}{^^8a}{\^E} % LATIN CAPITAL LETTER E WITH CIRCUMFLEX
\DoUTFCombination{00CB}{c3}{^^8b}{\"E} % LATIN CAPITAL LETTER E WITH DIAERESIS
\DoUTFCombination{00CC}{c3}{^^8c}{\`I} % LATIN CAPITAL LETTER I WITH GRAVE
\DoUTFCombination{00CD}{c3}{^^8d}{\'I} % LATIN CAPITAL LETTER I WITH ACUTE
\DoUTFCombination{00CE}{c3}{^^8e}{\^I} % LATIN CAPITAL LETTER I WITH CIRCUMFLEX
\DoUTFCombination{00CF}{c3}{^^8f}{\"I} % LATIN CAPITAL LETTER I WITH DIAERESIS
\DoUTFCombination{00D0}{c3}{^^90}{\DH} % LATIN CAPITAL LETTER ETH
\DoUTFCombination{00D1}{c3}{^^91}{\~N} % LATIN CAPITAL LETTER N WITH TILDE
\DoUTFCombination{00D2}{c3}{^^92}{\`O} % LATIN CAPITAL LETTER O WITH GRAVE
\DoUTFCombination{00D3}{c3}{^^93}{\'O} % LATIN CAPITAL LETTER O WITH ACUTE
\DoUTFCombination{00D4}{c3}{^^94}{\^O} % LATIN CAPITAL LETTER O WITH CIRCUMFLEX
\DoUTFCombination{00D5}{c3}{^^95}{\~O} % LATIN CAPITAL LETTER O WITH TILDE
\DoUTFCombination{00D6}{c3}{^^96}{\"O} % LATIN CAPITAL LETTER O WITH DIAERESIS
\DoUTFCombination{00D7}{c3}{^^97}{\tcsym{"D6}} % MULTIPLICATION SIGN
\DoUTFCombination{00D8}{c3}{^^98}{\O} % LATIN CAPITAL LETTER O WITH STROKE
\DoUTFCombination{00D9}{c3}{^^99}{\`U} % LATIN CAPITAL LETTER U WITH GRAVE
\DoUTFCombination{00DA}{c3}{^^9a}{\'U} % LATIN CAPITAL LETTER U WITH ACUTE
\DoUTFCombination{00DB}{c3}{^^9b}{\^U} % LATIN CAPITAL LETTER U WITH CIRCUMFLEX
\DoUTFCombination{00DC}{c3}{^^9c}{\"U} % LATIN CAPITAL LETTER U WITH DIAERESIS
\DoUTFCombination{00DD}{c3}{^^9d}{\'Y} % LATIN CAPITAL LETTER Y WITH ACUTE
\DoUTFCombination{00DE}{c3}{^^9e}{\TH} % LATIN CAPITAL LETTER THORN
\DoUTFCombination{00DF}{c3}{^^9f}{\ss} % LATIN SMALL LETTER SHARP S
\DoUTFCombination{00E0}{c3}{^^a0}{\`a} % LATIN SMALL LETTER A WITH GRAVE
\DoUTFCombination{00E1}{c3}{^^a1}{\'a} % LATIN SMALL LETTER A WITH ACUTE
\DoUTFCombination{00E2}{c3}{^^a2}{\^a} % LATIN SMALL LETTER A WITH CIRCUMFLEX
\DoUTFCombination{00E3}{c3}{^^a3}{\~a} % LATIN SMALL LETTER A WITH TILDE
\DoUTFCombination{00E4}{c3}{^^a4}{\"a} % LATIN SMALL LETTER A WITH DIAERESIS
\DoUTFCombination{00E5}{c3}{^^a5}{\aa} % LATIN SMALL LETTER A WITH RING ABOVE
\DoUTFCombination{00E6}{c3}{^^a6}{\ae} % LATIN SMALL LETTER AE
\DoUTFCombination{00E7}{c3}{^^a7}{\c{c}} % LATIN SMALL LETTER C WITH CEDILLA
\DoUTFCombination{00E8}{c3}{^^a8}{\`e} % LATIN SMALL LETTER E WITH GRAVE
\DoUTFCombination{00E9}{c3}{^^a9}{\'e} % LATIN SMALL LETTER E WITH ACUTE
\DoUTFCombination{00EA}{c3}{^^aa}{\^e} % LATIN SMALL LETTER E WITH CIRCUMFLEX
\DoUTFCombination{00EB}{c3}{^^ab}{\"e} % LATIN SMALL LETTER E WITH DIAERESIS
\DoUTFCombination{00EC}{c3}{^^ac}{\`i} % LATIN SMALL LETTER I WITH GRAVE
\DoUTFCombination{00ED}{c3}{^^ad}{\'i} % LATIN SMALL LETTER I WITH ACUTE
\DoUTFCombination{00EE}{c3}{^^ae}{\^i} % LATIN SMALL LETTER I WITH CIRCUMFLEX
\DoUTFCombination{00EF}{c3}{^^af}{\"i} % LATIN SMALL LETTER I WITH DIAERESIS
\DoUTFCombination{00F0}{c3}{^^b0}{\dh} % LATIN SMALL LETTER ETH
\DoUTFCombination{00F1}{c3}{^^b1}{\~n} % LATIN SMALL LETTER N WITH TILDE
\DoUTFCombination{00F2}{c3}{^^b2}{\`o} % LATIN SMALL LETTER O WITH GRAVE
\DoUTFCombination{00F3}{c3}{^^b3}{\'o} % LATIN SMALL LETTER O WITH ACUTE
\DoUTFCombination{00F4}{c3}{^^b4}{\^o} % LATIN SMALL LETTER O WITH CIRCUMFLEX
\DoUTFCombination{00F5}{c3}{^^b5}{\~o} % LATIN SMALL LETTER O WITH TILDE
\DoUTFCombination{00F6}{c3}{^^b6}{\"o} % LATIN SMALL LETTER O WITH DIAERESIS
\DoUTFCombination{00F7}{c3}{^^b7}{\tcsym{"F6}} % DIVISION SIGN
\DoUTFCombination{00F8}{c3}{^^b8}{\o} % LATIN SMALL LETTER O WITH STROKE
\DoUTFCombination{00F9}{c3}{^^b9}{\`u} % LATIN SMALL LETTER U WITH GRAVE
\DoUTFCombination{00FA}{c3}{^^ba}{\'u} % LATIN SMALL LETTER U WITH ACUTE
\DoUTFCombination{00FB}{c3}{^^bb}{\^u} % LATIN SMALL LETTER U WITH CIRCUMFLEX
\DoUTFCombination{00FC}{c3}{^^bc}{\"u} % LATIN SMALL LETTER U WITH DIAERESIS
\DoUTFCombination{00FD}{c3}{^^bd}{\'y} % LATIN SMALL LETTER Y WITH ACUTE
\DoUTFCombination{00FE}{c3}{^^be}{\th} % LATIN SMALL LETTER THORN
\DoUTFCombination{00FF}{c3}{^^bf}{\"y} % LATIN SMALL LETTER Y WITH DIAERESIS

%%%% Other characters omitted

\endinput
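
The \lowercase trick used in the Unicode branch of \DoUTFCombination can be tried in isolation. Here is a small sketch, assuming the plain format (where ~ is active); \makeactive is a hypothetical helper, not part of the file above, and it is applied to an ASCII character so that it runs on any engine.

```tex
% \makeactive is a hypothetical helper for illustration.
% #1 = character code in hexadecimal (uppercase A..F), #2 = replacement text
\def\makeactive#1#2{%
  \catcode"#1=\active
  \begingroup
    \lccode`\~="#1 % the space ends the hexadecimal number
    % \lowercase turns the active ~ into the active character with
    % code #1; \endgroup then restores the \lccode before \def runs
    \lowercase{\endgroup\def~}{#2}}

\makeactive{7C}{ or }% "7C is the vertical bar
tea|coffee
\bye
```

Since \endgroup is executed before \def, the definition is local to the current group and no \gdef is needed.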

plain-t1.tex

\catcode`@=11

\input utfplainmac

\message{Loading EC fonts}

\font\tenrm=ecrm1000 % roman text
\font\tctenrm=tcrm1000
% \font\sevenrm=ecrm0700
% \font\fiverm=ecrm0500

\font\tenbf=ecbx1000 % boldface extended
\font\tctenbf=tcbx1000
% \font\sevenbf=ecbx0700
% \font\fivebf=ecbx0500

\font\tentt=ectt1000 % typewriter
\font\tctentt=tctt1000

\font\tensl=ecsl1000 % slanted roman
\font\tctensl=tcsl1000

\font\tenit=ecti1000 % text italic
\font\tctenit=tcti1000

% \font\tenrm=ptmr8t % roman text
% \font\sevenrm=ptmr8t at 7pt
% \font\fiverm=ptmr8t at 5pt

% \font\tenbf=ptmb8t % boldface extended
% \font\sevenbf=ptmb8t at 7pt
% \font\fivebf=ptmb8t at 5pt

% \font\tentt=pcrr8t % typewriter

% \font\tensl=ptmro8t % slanted roman

% \font\tenit=ptmri8t % text italic

% \textfont0=\tenrm \scriptfont0=\sevenrm \scriptscriptfont0=\fiverm
% \textfont1=\teni \scriptfont1=\seveni \scriptscriptfont1=\fivei
% \textfont2=\tensy \scriptfont2=\sevensy \scriptscriptfont2=\fivesy
% \textfont3=\tenex \scriptfont3=\tenex \scriptscriptfont3=\tenex
% \textfont\itfam=\tenit
% \textfont\slfam=\tensl
% \textfont\bffam=\tenbf \scriptfont\bffam=\sevenbf
%   \scriptscriptfont\bffam=\fivebf
% \textfont\ttfam=\tentt

\def\rm{\fam\z@\let\tcfont\tctenrm\tenrm}
\def\it{\fam\itfam\let\tcfont\tctenit\tenit}
\def\sl{\fam\slfam\let\tcfont\tctensl\tensl}
\def\bf{\fam\bffam\let\tcfont\tctenbf\tenbf}
\def\tt{\fam\ttfam\let\tcfont\tctentt\tentt}

% set the font
\rm

\catcode`\@=11

% special characters
\chardef\pound="BF
\chardef\IJ="9C
\chardef\ij="BC
\chardef\L="8A
\chardef\l="AA
\chardef\DH="D0
\chardef\dh="F0
\chardef\TH="DE
\chardef\th="FE
\chardef\NG="8D
\chardef\ng="AD
\chardef\AA="C5
\chardef\aa="E5
\chardef\AE="C6
\chardef\ae="E6
\chardef\OE="D7
\chardef\oe="F7
\chardef\O="D8
\chardef\o="F8
\chardef\SS="DF
\chardef\ss="FF
\chardef\i="19
\chardef\j="1A
\let\DJ=\DH
\chardef\dj="9E

\def\@firstoftwo#1#2{#1}
\def\@secondoftwo#1#2{#2}
\def\@ifundefined#1{\expandafter\ifx\csname#1\endcsname\relax
  \expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi}

%%% \make@ec@accent is syntactic sugar; for example
%%% \make@ec@accent\x{abc} is equivalent to
%%%
%%% \def\x#1{\@ifundefined{ec@abc@\detokenize{#1}}
%%%   {\csname ec@abc\endcsname{#1}}{\csname ec@abc@#1\endcsname}}
%%%
%%% Thus a call like \x{y} looks whether \ec@abc@y is defined; if it
%%% is, then use it, otherwise resort to \ec@abc{y}, where \ec@abc is
%%% the general accent command. In this way we can define \x{y} to
%%% print a single character, for hyphenation purposes, for example.
\def\make@ec@accent#1#2{%
  \def#1##1{\@ifundefined{ec@#2@\detokenize{##1}}
    {\csname ec@#2\endcsname{##1}}{\csname ec@#2@##1\endcsname}}}
\make@ec@accent\`{grave}
\make@ec@accent\'{acute}
\make@ec@accent\^{circumflex}
\make@ec@accent\~{tilde}
\make@ec@accent\"{dieresis}
\make@ec@accent\H{doubleacute}
\make@ec@accent\r{ring}
\make@ec@accent\v{caron}
\make@ec@accent\u{breve}
\make@ec@accent\={macron}
\make@ec@accent\.{dotabove}
\make@ec@accent\c{cedilla}
\make@ec@accent\k{ogonek}

%%% Now we define the accents; for example, \ec@grave is defined as
%%% \` is in Plain TeX, except for the code point of the accent. But
%%% we also define \ec@grave@A to print a single character, which
%%% will then take part in hyphenation and kerning. The same holds
%%% for all other accented characters available in T1 encoded fonts.
%%% In some special cases we also provide more complicated
%%% definitions, to cover peculiar situations (like \c{g}, where the
%%% cedilla should go over the g).

% grave accent
\def\ec@grave#1{{\accent"0 #1}}
\chardef\ec@grave@A="C0
\chardef\ec@grave@a="E0
\chardef\ec@grave@E="C8
\chardef\ec@grave@e="E8
\chardef\ec@grave@I="CC
\chardef\ec@grave@i="EC
\chardef\ec@grave@O="D2
\chardef\ec@grave@o="F2
\chardef\ec@grave@U="D9
\chardef\ec@grave@u="F9

% acute accent
\def\ec@acute#1{{\accent"1 #1}}
\chardef\ec@acute@A="C1
\chardef\ec@acute@a="E1
\chardef\ec@acute@E="C9
\chardef\ec@acute@e="E9
\chardef\ec@acute@I="CD
\chardef\ec@acute@i="ED
\chardef\ec@acute@C="82
\chardef\ec@acute@c="A2
\chardef\ec@acute@L="88
\chardef\ec@acute@l="A8
\chardef\ec@acute@N="8B
\chardef\ec@acute@n="AB
\chardef\ec@acute@O="D3
\chardef\ec@acute@o="F3
\chardef\ec@acute@R="8F
\chardef\ec@acute@r="AF
\chardef\ec@acute@S="91
\chardef\ec@acute@s="B1
\chardef\ec@acute@U="DA
\chardef\ec@acute@u="FA
\chardef\ec@acute@Z="99
\chardef\ec@acute@z="B9

% circumflex accent
\def\ec@circumflex#1{{\accent"2 #1}}
\chardef\ec@circumflex@A="C2
\chardef\ec@circumflex@a="E2
\chardef\ec@circumflex@E="CA
\chardef\ec@circumflex@e="EA
\chardef\ec@circumflex@I="CE
\chardef\ec@circumflex@i="EE
\chardef\ec@circumflex@O="D4
\chardef\ec@circumflex@o="F4
\chardef\ec@circumflex@U="DB
\chardef\ec@circumflex@u="FB

% tilde accent
\def\ec@tilde#1{{\accent"3 #1}}
\chardef\ec@tilde@A="C3
\chardef\ec@tilde@a="E3
\chardef\ec@tilde@N="D1
\chardef\ec@tilde@n="F1
\chardef\ec@tilde@O="D5
\chardef\ec@tilde@o="F5

% dieresis
\def\ec@dieresis#1{{\accent"4 #1}}
\chardef\ec@dieresis@A="C4
\chardef\ec@dieresis@a="E4
\chardef\ec@dieresis@E="CB
\chardef\ec@dieresis@e="EB
\chardef\ec@dieresis@I="CF
\chardef\ec@dieresis@i="EF
\chardef\ec@dieresis@O="D6
\chardef\ec@dieresis@o="F6
\chardef\ec@dieresis@U="DC
\chardef\ec@dieresis@u="FC
\chardef\ec@dieresis@Y="98
\chardef\ec@dieresis@y="B8

% double acute (hungarian umlaut)
\def\ec@doubleacute#1{{\accent"5 #1}}
\chardef\ec@doubleacute@O="8E
\chardef\ec@doubleacute@o="AE
\chardef\ec@doubleacute@U="96
\chardef\ec@doubleacute@u="B6

% ring
\def\ec@ring#1{{\accent"6 #1}}
% \chardef\ec@ring@A="C5
% \chardef\ec@ring@a="E5
\chardef\ec@ring@U="97
\chardef\ec@ring@u="B7

% caron
\def\ec@caron#1{{\accent"7 #1}}
\chardef\ec@caron@C="83
\chardef\ec@caron@c="A3
\chardef\ec@caron@D="84
\chardef\ec@caron@d="A4
\chardef\ec@caron@E="85
\chardef\ec@caron@e="A5
\chardef\ec@caron@L="89
\chardef\ec@caron@l="A9
\chardef\ec@caron@N="8C
\chardef\ec@caron@n="AC
\chardef\ec@caron@R="90
\chardef\ec@caron@r="B0
\chardef\ec@caron@S="92
\chardef\ec@caron@s="B2
\chardef\ec@caron@T="94
\chardef\ec@caron@t="B4
\chardef\ec@caron@Z="9A
\chardef\ec@caron@z="BA

% breve
\def\ec@breve#1{{\accent"8 #1}}
\chardef\ec@breve@G="87
\chardef\ec@breve@g="A7

% macron
\def\ec@macron#1{{\accent"9 #1}}

% dot above
\def\ec@dotabove#1{{\accent"A #1}}
\chardef\ec@dotabove@Z="9B
\chardef\ec@dotabove@z="BB

% cedilla
\def\ec@cedilla#1{{\setbox\z@\hbox{#1}\ifdim\ht\z@=1ex\accent"0B #1%
  \else\ooalign{\unhbox\z@\crcr\hidewidth\char"0B\hidewidth}\fi}}
\chardef\ec@cedilla@C="C7
\chardef\ec@cedilla@c="E7
\chardef\ec@cedilla@S="93
\chardef\ec@cedilla@s="B3
\chardef\ec@cedilla@T="95
\chardef\ec@cedilla@t="B5
\def\ec@cedilla@g{\accent`\`g}

% ogonek
\def\ec@ogonek#1{{\ooalign{\null#1\crcr\hidewidth\char"0C\hidewidth}}}
\chardef\ec@ogonek@A="81
\chardef\ec@ogonek@a="A1
\chardef\ec@ogonek@E="86
\chardef\ec@ogonek@e="A6
%%% lowercase u is special
\def\ec@ogonek@u{{\ooalign{\null u\crcr\hidewidth\char"0C}}}

% bar under
\def\b#1{{\o@lign{\relax#1\crcr\hidewidth\sh@ft{-3ex}%
  \vbox to.2ex{\hbox{\char"09}\vss}\hidewidth}}}

%%% A special purpose macro
% catalan dot
\def\c@talandot#1{\kern#1em\llap{$\m@th\cdot$}\kern-#1em}
\def\Lmiddledot{L\c@talandot{-.1}}
\def\lmiddledot{l\c@talandot{.15}}

\catcode`@=12

\endinput
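
The table of precomposed glyphs can be extended from a document in the same style. The sketch below assumes plain-t1.tex has been loaded and relies on the T1 slots "80/"A0 for Ă/ă and "DD/"FD for Ý/ý. These four definitions are my additions for illustration; they are not in the file above.

```tex
\input plain-t1
\catcode`@=11
% Additional single-glyph definitions (T1 slots)
\chardef\ec@breve@A="80
\chardef\ec@breve@a="A0
\chardef\ec@acute@Y="DD
\chardef\ec@acute@y="FD
\catcode`@=12

% \u{A}, \u{a}, \'Y and \'y now select single glyphs;
% \u{o} has no entry and still goes through \accent
\u{A}\u{a} \'Y\'y \u{o}
\bye
```

Because \make@ec@accent looks for \ec@<accent>@<letter> at the time of use, nothing else needs to be redefined.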

test.tex

% -*- coding: utf-8 -*-
\input plain-t1

Here are some characters: $\alpha \Gamma$ æ Æ \'x

Ǎ Ǹ ă ŭ ā ē ü ł Ł ý ß Ş Ģ \c{g} ø ǽ Ǽ Ů ů ¡ ¿ ę Ą Ǫ ǫ Ų ų Į į

ĿL Ŀl ŀl

Český Krumlov (německy Böhmisch Krumau, popřípadě Krummau) je okresní
město v Jihočeském kraji, zhruba 25 km jižně od Českých
Budějovic. Rozkládá se pod hřebenem Blanského lesa a protéká jím řeka
Vltava. Jedná se o významné turistické centrum Jižních Čech.
Středověké centrum města, které obklopuje meandry Vltavy, je od roku
1963 městskou památkovou rezervací a od roku 1992 je zapsáno na
seznamu světového dědictví UNESCO. V roce 2003 bylo městskou
památkovou zónou vyhlášeno předměstí Plešivec (jižně od historického
jádra).

Český Krumlov é uma cidade República Checa na lista da UNESCO como
Patrimônio da Humanidade. Se encontra na Boêmia do Sul (região), é a
capital antiga da região de Rosenberg, a nobreza mais rica e influente
do país. A construção da cidade e seu Castelo começou no Século
XIII. A população da cidade em 2005 era de 13942 habitantes, e a área
de uns 22 km².

Český Krumlov (13.942 abitanti) è una città della Boemia meridionale,
in Repubblica Ceca, molto conosciuta per la raffinata architettura del
centro storico e per il Castello. Era conosciuta come Krumau fino alla
Seconda guerra mondiale quando alla fine furono espulsi gli abitanti
di lingua tedesca.  Český Krumlov letteralmente significa ``Krumlov
Ceca (Boema)''; ne esiste infatti anche una morava.

\UseUnicodeCharacter{00C8}

\bye

The omitted list (messages here are limited to 30000 characters) shouldn't cause the test example to go wrong.
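
If a character you need is not covered, an entry can be added in the document itself after loading the macros. This is a hedged sketch: the choice of U+00B5 (UTF-8 bytes "C2 "B5) and of $\mu$ as replacement text is mine, not necessarily what the omitted part of the list does.

```tex
% -*- coding: utf-8 -*-
\input plain-t1
% Hypothetical extra entry: code point, prefix, second byte, definition
\DoUTFCombination{00B5}{c2}{^^b5}{$\mu$} % MICRO SIGN

10 µm
\bye
```

On an 8-bit engine this only works for prefixes that utfplainmac.tex has made active (c2 is one of them).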

Compile test.tex with pdftex.

I also have a plain-cmu.tex file that sets up the fonts to use the CMUnicode OpenType fonts.

