[TeX/LaTeX] What are \lccode and \uccode used for?

luatex pdftex tex-core xetex

In TeX, each of the 256 bytes has an associated \lccode and \uccode, integers in the range [0,255] which indicate, among other things, how \lowercase and \uppercase act. There are of course a bunch of other codes (\mathcode and \catcode, for instance), but I am focusing here on the case-changing codes.

A look at the TeXbook tells me about the following uses of the \lccode and \uccode:

  • \lowercase <general text> turns each character token in the argument into a character token with the same category code, but a character code equal to the \lccode of the original character code, unless the \lccode is zero, in which case, the original character code is retained.

  • \uppercase <general text> behaves in the same way, using the \uccode instead.

  • When hyphenating, TeX takes whatever characters reached its stomach (so either from tokens with category code 11 or 12, or from \chardef'd tokens, or \char), and defines a "letter" to be a character with non-zero \lccode. A letter is lowercase if its \lccode is equal to its character code.
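The first two rules can be illustrated with a minimal sketch. The mapping of A to x is an arbitrary choice made purely for illustration, and the example assumes LaTeX's default codes (B and C lowercase to b and c, and ! has an \lccode of zero).

```latex
\documentclass{article}
\begin{document}

% Illustrative mapping only: make \lowercase turn A into x.
\begingroup
  \lccode`\A=`\x
  % The conversion is done before \endgroup is executed, so the
  % local \lccode change is undone but the text stays converted.
  \lowercase{\endgroup ABC!}% A -> x, B -> b, C -> c; ! is kept (\lccode 0)
\par

% \uppercase consults the \uccode table in the same way.
\uppercase{abc!}

\end{document}
```

Category codes are untouched in both cases: the x produced from A is still a letter (category code 11).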

Is this all? In particular, does TeX use the \uccode for any purpose other than the \uppercase primitive? What about other engines, pdfTeX, XeTeX, and LuaTeX?

Best Answer

The \lccode of a character is used in hyphenation when \uchyph is set to zero (or any non-positive value), as the following test document shows:

\documentclass{article}
\begin{document}

\uchyph=0 %

\begingroup
  \lccode`\C=`\C % C is its own lowercase form: the word is a hyphenation candidate
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Capitalised word.
  \par
\endgroup

\begingroup
  \lccode`\C=`\c % the usual setting: with \uchyph=0 the word is not a candidate
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Capitalised word.
  \par
\endgroup

\begingroup
  \uccode`\C=`\C % the usual setting; \lccode of C is unchanged here
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Capitalised word.
  \par
\endgroup

\begingroup
  \uccode`\C=`\c % changing \uccode should make no difference to hyphenation
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Some filler text. 
  Capitalised word.
  \par
\endgroup

\end{document}

Notice that \uchyph is therefore misleadingly named: what is tested is whether the word starts with a lowercase letter (one whose \lccode equals its own character code), not whether it starts with an uppercase one.
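A more compact way to check this is sketched below. It assumes a classic engine such as pdfTeX and uses LaTeX's \showhyphens, which writes the hyphenation points it finds to the log, so no filler text is needed. Comparing the two log entries should show whether the \lccode of the first letter changes the result; where the points fall depends on the hyphenation patterns that are loaded.

```latex
\documentclass{article}
\begin{document}

\uchyph=0 % only words whose first letter is "lowercase" are candidates

\begingroup
  \lccode`\C=`\C % C is its own lowercase form here
  \showhyphens{Capitalised}
\endgroup

\begingroup
  \lccode`\C=`\c % the usual LaTeX setting
  \showhyphens{Capitalised}
\endgroup

\end{document}
```

The groups keep each \lccode assignment local, so the second test is not affected by the first.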
