Diacritics above dotted i (i with tittle)

It is taught that, in order to place an accent mark on top of the letter "i", one should use \i to remove the tittle (the dot) first. (That the dot may need to be removed is of course a matter of convention. To indicate vowel length in Latin, people place a macron above the "i" with or without tittle; that is, I have encountered both ways of doing things.)

Strangely, however, the accent commands \', \`, \^, \", and \. seem to remove the tittle automatically:

[Figure: accents above i with and without tittle]

\documentclass{article}


\begin{document}

\verb|\ACCENT{i}|:\hphantom{\texttt{A}}\quad
  \'{i} \`{i} \^{i} \"{i} \.{i} \~{i} \={i} \H{i} \r{i} \v{i} \u{i}

\verb|\ACCENT{\i}|:\quad
  \'{\i} \`{\i} \^{\i} \"{\i} \.{\i} \~{\i} \={\i} \H{\i} \r{\i} \v{\i} \u{\i}

\end{document}

1. Since when has it been the case that an accent command in (La)TeX removes the tittle from the letter "i" in those 5 cases? (And are there any special cases I left out?)

2. How can I avoid this? That is, I would like the accent to be placed above the letter i with tittle.

Interestingly, the tittle is left intact for the letter "j":

[Figure: accents above j with and without tittle]

Best Answer

In the definition files for output encodings, you can find the defined combinations that are substituted with precomposed accented characters; for instance, in t1enc.def we find

\DeclareTextComposite{\.}{T1}{i}{`\i}
\DeclareTextComposite{\.}{T1}{\i}{`\i}
\DeclareTextComposite{\`}{T1}{i}{236}
\DeclareTextComposite{\`}{T1}{\i}{236}
\DeclareTextComposite{\'}{T1}{i}{237}
\DeclareTextComposite{\'}{T1}{\i}{237}
\DeclareTextComposite{\^}{T1}{i}{238}
\DeclareTextComposite{\^}{T1}{\i}{238}
\DeclareTextComposite{\"}{T1}{i}{239}
\DeclareTextComposite{\"}{T1}{\i}{239}

so that

\.i \`i \'i \^i \"i

are completely equivalent to

\.{\i} \`{\i} \'{\i} \^{\i} \"{\i}

Here's the corresponding code in ot1enc.def, where only the dot-above accent maps to a precomposed glyph (which is, of course, the ordinary dotted ‘i’):

\DeclareTextComposite{\.}{OT1}{i}{`\i}
\DeclareTextComposite{\.}{OT1}{\i}{`\i}
\DeclareTextCompositeCommand{\`}{OT1}{i}{\@tabacckludge`\i}
\DeclareTextCompositeCommand{\'}{OT1}{i}{\@tabacckludge'\i}
\DeclareTextCompositeCommand{\^}{OT1}{i}{\^\i}
\DeclareTextCompositeCommand{\"}{OT1}{i}{\"\i}

Note that the predefined combinations in t1enc.def correspond exactly to the precomposed accented letters available in a T1 encoded font. For the other accents, no combination is defined, so one needs \i for the letter to lose its dot.

Nobody, however, prevents you from defining your own composites:

\documentclass{article}

\DeclareTextCompositeCommand{\~}{OT1}{i}{\~\i}
\DeclareTextCompositeCommand{\=}{OT1}{i}{\=\i}
\DeclareTextCompositeCommand{\H}{OT1}{i}{\H\i}
\DeclareTextCompositeCommand{\r}{OT1}{i}{\r\i}
\DeclareTextCompositeCommand{\v}{OT1}{i}{\v\i}
\DeclareTextCompositeCommand{\u}{OT1}{i}{\u\i}


\begin{document}

\verb|\ACCENT{i}|:\hphantom{\texttt{A}}\quad
  \'{i} \`{i} \^{i} \"{i} \.{i} \~{i} \={i} \H{i} \r{i} \v{i} \u{i}

\verb|\ACCENT{\i}|:\quad
  \'{\i} \`{\i} \^{\i} \"{\i} \.{\i} \~{\i} \={\i} \H{\i} \r{\i} \v{\i} \u{\i}

\end{document}


Why are just some combinations involving ‘i’ defined and not all of them? Because those are the most common, for they correspond to glyphs actually in the encoding, whereas the uncommon ones would have wasted precious memory (remember that when LaTeX2e was released, computers were quite different from the current machines).


How to “undefine” those combinations with ‘i’? One has to know that \DeclareTextCompositeCommand{\"}{OT1}{i}{\"\i} defines the macro

\\OT1\"-i

to expand to \"\i, while \DeclareTextComposite{\.}{OT1}{i}{`\i} defines

\\OT1\.-i

to expand to \char`\i (that is, ‘print an i’). LaTeX tests for the existence of this macro when trying an accent. Note the peculiar name, where the first backslash denotes the escape character, whereas the inner ones are just characters. You can define a function for undefining the combinations you want to exclude:

\documentclass{article}

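% Remove the composite declared for accent #1, encoding #2, letter #3
% by resetting its internal macro (see the naming scheme above) to \relax.
\newcommand{\UndeclareTextComposite}[3]{%
  \expandafter\let\csname\expandafter\string\csname #2\endcsname\string#1-\string#3\endcsname\relax
}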
\UndeclareTextComposite{\'}{OT1}{i}
\UndeclareTextComposite{\`}{OT1}{i}
\UndeclareTextComposite{\^}{OT1}{i}
\UndeclareTextComposite{\"}{OT1}{i}
\UndeclareTextComposite{\.}{OT1}{i}

\begin{document}

\verb|\ACCENT{i}|:\hphantom{\texttt{A}}\quad
  \'{i} \`{i} \^{i} \"{i} \.{i} \~{i} \={i} \H{i} \r{i} \v{i} \u{i}

\verb|\ACCENT{\i}|:\quad
  \'{\i} \`{\i} \^{\i} \"{\i} \.{\i} \~{\i} \={\i} \H{\i} \r{\i} \v{\i} \u{\i}

\end{document}
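
The same trick should carry over to T1, where the five combinations are real precomposed glyphs. What follows is only a minimal sketch, assuming that fontenc is loaded with the T1 option and that the composite macros follow the naming scheme described above; \UndeclareTextComposite is the helper defined in this answer, not a kernel command. Only the entries for the plain i are removed, so \'{\i} and friends still select the precomposed dotless forms.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}

% Helper from this answer (not part of the LaTeX kernel):
% reset the composite for accent #1, encoding #2, letter #3 to \relax.
\newcommand{\UndeclareTextComposite}[3]{%
  \expandafter\let\csname\expandafter\string\csname #2\endcsname\string#1-\string#3\endcsname\relax
}
% Undeclare only the T1 composites for the dotted i;
% the ones for \i are left alone.
\UndeclareTextComposite{\'}{T1}{i}
\UndeclareTextComposite{\`}{T1}{i}
\UndeclareTextComposite{\^}{T1}{i}
\UndeclareTextComposite{\"}{T1}{i}
\UndeclareTextComposite{\.}{T1}{i}

\begin{document}

% accent placed above the dotted i
\'{i} \`{i} \^{i} \"{i} \.{i}

% still the precomposed dotless glyphs
\'{\i} \`{\i} \^{\i} \"{\i} \.{\i}

\end{document}
```

Keep in mind that, once a composite is gone, the accent is built with the \accent primitive instead of a precomposed glyph, which can interfere with automatic hyphenation of the word containing it.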

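
If you want to check which combinations are currently declared, you can ask TeX for the meaning of the internal macro described above. This is just a sketch: \ShowTextComposite is a hypothetical helper name introduced here for illustration, and it relies on the same naming scheme as \UndeclareTextComposite.

```latex
\documentclass{article}

% \ShowTextComposite is a hypothetical helper, not a kernel command.
% It writes the meaning of the composite macro for accent #1,
% encoding #2, letter #3 to the terminal and the log file.
\newcommand{\ShowTextComposite}[3]{%
  \typeout{\expandafter\meaning
    \csname\expandafter\string\csname #2\endcsname\string#1-\string#3\endcsname}%
}

\ShowTextComposite{\"}{OT1}{i} % declared in ot1enc.def
\ShowTextComposite{\~}{OT1}{i} % not declared by default

\begin{document}

\"{i} \~{i}

\end{document}
```

A declared combination is reported as a macro definition, whereas an undeclared (or undeclared-again) one is reported as \relax, which is exactly the condition LaTeX checks before falling back to the plain accent.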


Regarding the question “since when are those combinations defined”, I could find nothing in the LaTeX sources, so my guess is that they have been there since the start of LaTeX2e. I remember a conversation with Claudio Beccari, who said he had pressed the LaTeX team to include them; the addition possibly happened before the release of LaTeX2e, while the font encoding machinery was being developed.


Finally, note that BibTeX accepts {\×i} and {\×\i} equally (where \× denotes one of the accent commands \., \`, \', \^, \"). (Adapted from a comment elsewhere.)
