Consider the following code:
\documentclass{article}
\begin{document}
We define multiplication by
$$v_1x = 0,\quad v_2x = v_1,\quad v_1y = -v_1,\quad v_2y = v_1$$
\end{document}
which looks like this: [image: rendered PDF output]
It's as simple as it gets, but when I copy the content of the PDF I get:
We de??ne multiplication by
v1x = 0; v2x = v1; v1y = ??v1; v2y = v1
1
So there are 4 errors:
- The fi in define disappears
- , becomes ;
- The minus sign is not being copied properly
- Another 1 appears at the very end of the copied text
How can I prevent this from happening? I use Texmaker with MiKTeX 2.9 and pdfLaTeX.
Best Answer
Unicode mapping based on font encoding
Packages cmap or mmap add glyph-to-Unicode conversion information to the PDF file, based on the TeX font encoding in use. They hook into the font loading mechanism of LaTeX and should therefore be loaded as early as possible.
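A minimal sketch of the early-loading advice, applied to the document from the question. The position right after \documentclass and the T1 line are illustrative assumptions; the only point is that mmap comes before any font-related package:

```latex
\documentclass{article}
% Load mmap first, before any package that loads fonts,
% so that its hooks into the font loading mechanism are in place.
\usepackage{mmap}
\usepackage[T1]{fontenc}
\begin{document}
We define multiplication by
$$v_1x = 0,\quad v_2x = v_1,\quad v_1y = -v_1,\quad v_2y = v_1$$
\end{document}
```

Replacing mmap with cmap in the same position should work the same way for text fonts.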
mmap is the preferred choice here because, as far as I know, it has better math support.

Unicode mapping based on glyph name
An alternative is a pdfTeX feature that adds the mapping to Unicode based on the names of the glyphs in the font. Consequently, it does not work for PK fonts, because they do not contain glyph names.
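The glyph-name approach can be sketched as follows. It assumes the file glyphtounicode.tex from the pdfTeX distribution is installed, and that neither cmap nor mmap is loaded in the same document:

```latex
\documentclass{article}
% Read the table that maps glyph names to Unicode code points ...
\input{glyphtounicode}
% ... and tell pdfTeX to write ToUnicode data for the fonts it embeds.
\pdfgentounicode=1
\begin{document}
We define multiplication by
$$v_1x = 0,\quad v_2x = v_1,\quad v_1y = -v_1,\quad v_2y = v_1$$
\end{document}
```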
Caution: Packages cmap or mmap cannot be used together with \pdfgentounicode. The result would be a duplicated entry in the font data dictionary, which the PDF specification does not allow. Copy&paste then yields an unpredictable result that depends on the PDF viewer.
Font encoding
Especially if you have accented characters or other special symbols, you should consider using the T1 font encoding. The default encoding for LaTeX is OT1, which supports 7 bits only (max. 128 glyphs). Accented characters are therefore constructed from a base letter and a separate accent glyph, which is bad for copy&paste.

You should have the cm-super font bundle installed, which contains Type 1 versions of the EC fonts. Or use the modern Latin Modern fonts, which descend from the CM/EC fonts.
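A minimal sketch of the font encoding advice, assuming the Latin Modern fonts are installed and selected through the lmodern package. The sample sentence is made up for illustration:

```latex
\documentclass{article}
% T1 provides 8-bit fonts in which accented letters are single glyphs.
\usepackage[T1]{fontenc}
% Latin Modern: scalable fonts descended from the CM/EC fonts.
\usepackage{lmodern}
\begin{document}
% The accents are typed as macros, so the source stays plain ASCII.
A na\"{\i}ve caf\'e owner in Z\"urich.
\end{document}
```

With cm-super installed instead, the \usepackage{lmodern} line can be dropped, and the EC fonts are then used in their Type 1 versions.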