There are several ways of setting PDF metadata when compiling LaTeX documents. The two most popular are arguably via \pdfinfo,
\pdfinfo{
/Author (Erwin Schrödinger)
}
and hyperref,
\usepackage[pdftex,
pdfauthor={Erwin Schrödinger},
]{hyperref}
With both methods, however, the above example produces mojibake in the PDF output:
$ pdfinfo main.pdf
Title:
Subject:
Keywords:
Author: Erwin Schrödinger
[...]
Adding \usepackage[utf8]{inputenc} makes no difference.
How to fix this?
Best Answer
Package hyperref
hyperref encodes correctly, but the options should be set after hyperref is loaded, that is, with \hypersetup rather than as package options. Otherwise LaTeX expands the options the hard way and hyperref will only see the expanded garbage.
Extended example:
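A minimal sketch of such a document, assuming pdfLaTeX and a source file saved as UTF-8. The fontenc line and the body text are illustrative choices, not requirements of the fix; the point is only that pdfauthor moves from the package options into \hypersetup.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}   % illustrative; not needed for the metadata itself
\usepackage{hyperref}

% Optional: let hyperref pick PDFDocEncoding or Unicode per string.
% Set it before the strings it should apply to.
\hypersetup{pdfencoding=auto}

% Metadata is set after hyperref is loaded, so hyperref sees the
% unexpanded input and can encode it itself.
\hypersetup{
  pdfauthor={Erwin Schrödinger},
}

\begin{document}
Hello World
\end{document}
```

Running pdfinfo on the resulting PDF should then report the author without mojibake.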
The metadata strings are stored in the PDF file either in PDFDocEncoding or in UTF-16BE with BOM. The full power of Unicode for metadata and bookmarks is enabled by either of the following hyperref options: unicode or pdfencoding=auto.
pdfencoding=auto is more flexible than unicode. If the string fits PDFDocEncoding (an 8-bit encoding), then this encoding is used, otherwise Unicode.
Low level, manually
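The low-level version for specifying the metadata with pdfTeX and without hyperref writes the already encoded string directly into \pdfinfo, for example in PDFDocEncoding with ö given as the octal escape \366.
The Unicode variants store the same string as UTF-16BE with BOM, either as a literal string with octal escapes or as a hexadecimal string: \pdfinfo{/Author <FEFF0045007200770069006E0020005300630068007200F600640069006E006700650072>}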
The PDF specification states which encodings can be used; PDFDocEncoding is listed as a full table in Annex D, "Character Sets and Encodings".
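As a minimal sketch of the manual approach, assuming pdfLaTeX: the document below sets the author in PDFDocEncoding, where ö is byte 0xF6 (octal 366). \string keeps TeX from interpreting the backslash, so the PDF file receives the literal escape \366. The hexadecimal UTF-16BE form is shown as a commented alternative; only one of the two should be active, since both write the same /Author key.

```latex
\documentclass{article}

% PDFDocEncoding variant: literal string with an octal escape for ö.
\pdfinfo{/Author (Erwin Schr\string\366dinger)}

% Unicode variant (use instead of the line above, not in addition):
% FEFF is the byte order mark, followed by one 16-bit unit per character.
% \pdfinfo{/Author <FEFF0045007200770069006E0020005300630068007200F600640069006E006700650072>}

\begin{document}
Hello World
\end{document}
```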
Low level, but with automatic encoding conversions
If you want to convert from UTF-8 (or another input encoding), then package stringenc helps (LaTeX and plain TeX formats). It can be used in a short plain TeX file, or in a more elaborate setup that reimplements pdfencoding=auto.
The input need not be encoded in UTF-8; package stringenc supports many more encodings.
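A rough plain TeX sketch of the pdfencoding=auto idea, to be compiled with pdftex. It assumes that stringenc provides \StringEncodingConvertTest and \StringEncodingConvert with the argument order result macro, string, source encoding, target encoding (plus success and failure branches for the test variant), and that the encoding names utf8, pdfdoc and utf16be are accepted; these details should be checked against the stringenc documentation. \SetAuthor and \EncodedAuthor are hypothetical names introduced here for illustration.

```tex
% Plain TeX, compile with: pdftex main.tex
\input stringenc.sty\relax

% \SetAuthor{<UTF-8 string>} (hypothetical helper):
% try PDFDocEncoding first, fall back to UTF-16BE with BOM.
\def\SetAuthor#1{%
  \StringEncodingConvertTest\EncodedAuthor{#1}{utf8}{pdfdoc}{%
    % The string fits PDFDocEncoding: write a literal string and let
    % \pdfescapestring escape special and non-ASCII bytes.
    \pdfinfo{/Author (\pdfescapestring{\EncodedAuthor})}%
  }{%
    % Otherwise convert to UTF-16BE and write a hexadecimal string,
    % prefixed with the byte order mark FEFF.
    \StringEncodingConvert\EncodedAuthor{#1}{utf8}{utf16be}%
    \pdfinfo{/Author <FEFF\pdfescapehex{\EncodedAuthor}>}%
  }%
}

\SetAuthor{Erwin Schrödinger}

Hello World
\bye
```

For an input in a different encoding, the source encoding argument (utf8 above) would be replaced by the matching stringenc encoding name.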