[TeX/LaTeX] Proper use of cmap and mmap

comparison, font-encodings, pdftex

I am curious about the differences between

  • \usepackage{cmap}
  • \usepackage[resetfonts]{cmap}
  • \usepackage{mmap} (= \usepackage[useTeX]{mmap})
  • \usepackage[noTeX]{mmap}

What are their precise differences? (I am not familiar with (La)TeX internals, so a beginner's explanation would be very helpful.) If it matters: I am using \usepackage[T3,T1]{fontenc} (T3 is required by the tipa package), which is important in my case but might not matter for a fully general answer to this question.

(Also, I am assuming that each of these packages needs to be loaded after fixltx2e but before times and fontenc. Is that right?)

Update: This starter code is great. I uncommented the tipa lines, saved the file in UTF-8 (changing the input encoding line to \usepackage[utf8]{inputenc}), and went through all the cmap/mmap options. However, I see no difference in the text pasted from the output PDF, perhaps because the MWE uses virtual fonts (by default?). Ideally, someone could help find a minimal example (both of us have tried a lot) that shows five different pasting behaviors when the only difference is whether no cmap/mmap package or one of the four cmap/mmap options is loaded. That example would ideally also specify the encoding the file must be saved in (remember not to use a BOM for UTF-8). More importantly, I don't know whether all input encodings lead to identical behavior, assuming accented characters are escaped correctly in input encodings that lack a particular character.

Best Answer

The package mmap does a little more than cmap: it also handles mathematical symbols in your PDF.

So if your PDF contains no mathematics, use \usepackage{cmap}. If you then still have problems with ligatures in Computer Modern, use \usepackage[resetfonts]{cmap}. If the PDF contains mathematical symbols, use \usepackage{mmap}. If you still have problems, use \usepackage[noTeX]{mmap}.

The differences are:

  • \usepackage{cmap}: accepts preloaded fonts without reloading them.
  • \usepackage[resetfonts]{cmap}: as the cmap README explains, this forces preloaded fonts (Computer Modern) to be reloaded.
  • \usepackage[useTeX]{mmap} and \usepackage{mmap}: do everything cmap does and also correct mathematical symbols in your PDF; they use the new -m.cmap files ("uses ascii strings for the macro-names").
  • \usepackage[noTeX]{mmap}: does everything cmap does and also corrects mathematical symbols in your PDF; uses the cmap files (Unicode).
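Since the two mmap options differ from cmap only in how math fonts are treated, a test file needs some mathematics to show any difference between them. The following is a minimal sketch for that purpose, not a verified demonstration; the file contents and the chosen symbols are illustrative assumptions.

```latex
\documentclass{article}
% Swap this line for one of the other three variants and recompile:
%   \usepackage{cmap}
%   \usepackage[resetfonts]{cmap}
%   \usepackage[noTeX]{mmap}
\usepackage{mmap}              % same as \usepackage[useTeX]{mmap}
\usepackage[T1]{fontenc}       % loaded after cmap/mmap

\begin{document}

Text with ligatures: fi fl ff ffi ffl.

% Math is the part where the mmap options are supposed to differ.
Math: $\alpha + \beta \le \gamma$ and $\sum_{i=1}^{n} x_i$.

\end{document}
```

Going by the option descriptions above, one would expect the math to paste as ASCII macro names with useTeX and as Unicode characters with noTeX. This is not verified here, and the pasted result can also depend on the PDF viewer.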

Load cmap or mmap first, then fontenc and babel.

The documentation of fixltx2e only says "load in the preamble". I had no problems loading it after fontenc, babel and the fonts used.

For your own experiments, use the following MWE:
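For the setup described in the question (T3 and T1 with tipa and times), the load order recommended here could look like the sketch below. It only illustrates the ordering; the position of fixltx2e reflects the experience reported above rather than a documented requirement, and the test sentence is an arbitrary example.

```latex
\documentclass{article}
\usepackage{cmap}              % or one of the mmap variants; loaded first
\usepackage[utf8]{inputenc}    % file saved as UTF-8 without a BOM
\usepackage[T3,T1]{fontenc}    % T3 is needed by tipa; T1 stays the default
\usepackage{times}             % text font, loaded after cmap
\usepackage{tipa}              % phonetic alphabet
\usepackage{fixltx2e}          % position assumed uncritical, see above

\begin{document}

Ligatures: fi fl. Phonetic: \textipa{[\!b] [\:r] [\;B]}

\end{document}
```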

\listfiles                      % shows used files
\documentclass[12pt]{scrartcl}
%\usepackage{cmap}              % pure T1 fonts 
%\usepackage[resetfonts]{cmap}  % pure T1 fonts, reset CM
%\usepackage{mmap}              % cmap + mathematics (ASCII)
%\usepackage[noTeX]{mmap}       % cmap + mathematics (Unicode)

 \usepackage[latin9]{inputenc}  % or utf8
 \usepackage[T1]{fontenc}       % font encoding
%\usepackage[T3,T1]{fontenc}    % T3 for package tipa
%\usepackage{tipa}              % Phonetic alphabet
 \usepackage[ngerman]{babel}    % new German orthography

%\usepackage{lmodern}           % Latin Modern
%\usepackage{tgpagella}         % has no virtual fonts
%\usepackage[osf]{mathpazo}     % old-style figures work

%\usepackage{libertine}         % Libertine Legacy (with virtual fonts)
 \usepackage[osf]{libertine}    % with old-style (lowercase) figures

\newcommand*{\III}{\libertineGlyph{Threeroman}}
\newcommand*{\IV}{\libertineGlyph{Fourroman}}


\begin{document}

Römische Zahlen: \III, \IV.

\textsc{Ligaturen}: auffliegen auffinden finden Auflage Schifffahrt.

\textsc{Korrekt}: auf\/fliegen auf\/finden finden Auf\/lage Schiff\/fahrt.

Ziffern: 0123456789.

Donau Donaudampfschiff Donaudampfschifffahrt Donaudampfschifffahrtskapitän 
Donaudampfschifffahrtskapitän 
Donaudampfschifffahrtskapitän Donaudampfschifffahrtskapitän 
Donaudampfschifffahrtskapitän Donaudampfschifffahrtskapitän

%\textipa{[\!b] [\:r] [\;B]}

\end{document}

Add or remove the comment signs to test cmap and mmap with or without fontenc and with different fonts.

BTW: "Donaudampfschifffahrtskapitän" is a German word that is useful for testing hyphenation.