[TeX/LaTeX] PDF/A validation fails with embedded CID font subset caused by ligatures in vector image

inkscape, ligatures, pdf-a, pdftex

Objective and Workflow

I want to produce a PDF/A-2b compliant document with pdfLaTeX using the pdfx package. The document includes PDF vector images, which are exported from SVG sources with Inkscape 0.91 using Save As with Convert texts to paths left unchecked. The text in those vector images contains ligatures, such as the one produced by the character combination "ff". The font used, Linux Biolinum (TrueType), supports these ligatures. For PDF/A validation, I use the 3-Heights™ PDF Validator Online Tool. See also the example set of files.
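
For readers who want to reproduce the setup, the target document described above might look roughly like the following. This is only a minimal sketch, not the asker's actual file: the name figure.pdf is a placeholder for the PDF exported from Inkscape, and the a-2b option is the pdfx option that requests PDF/A-2b conformance.

```latex
% Minimal sketch of the target document; compile with pdflatex.
\documentclass{article}
\usepackage[a-2b]{pdfx}   % request PDF/A-2b conformance
\usepackage{graphicx}     % provides \includegraphics

\begin{document}
Text of the target document.

% figure.pdf is a placeholder name for the vector image exported
% from Inkscape with "Convert texts to paths" left unchecked.
\includegraphics{figure.pdf}
\end{document}
```

pdfx can also read document metadata from a \jobname.xmpdata file placed next to the source; it is left out here to keep the sketch short.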

Problem

Both the exported PDF vector image and the target PDF document created with pdfLaTeX, which includes the vector image, cause the following validation problem:

The key CIDToGIDMap is required but missing.

Without using ligatures in the text, the PDF file contains only the embedded subset:

  • LinBiolinum with Type: TrueType and Encoding: Ansi

When using ligatures, the PDF file contains an additional embedded subset:

  • LinBiolinum with Type: TrueType (CID) and Encoding: Identity-H

The validation problem occurs if and only if the second subset is present, i.e., the use of ligatures causes a second font subset to be included in the PDF, which in turn causes the PDF/A validation issue. Another tool, veraPDF, associates it with Rule 6.2.11.3-2.

Question

How can the described workflow be adapted to use SVG images including text with ligatures while preserving PDF/A compatibility?

Best Answer

As far as I know, no TeX-based software currently creates or otherwise handles CIDToGIDMap streams. They are needed particularly with CJK fonts, as used in documents produced with XeLaTeX. This is one of the last few hurdles that prevent generating PDF/A-conforming CJK documents with XeLaTeX.

Proper support is something that I want to incorporate into pdfx.sty, so I'm looking for examples to aid in development. Could you possibly send me a small-ish example document which shows the problem in a non-CJK setting? (ross.moore@mq.edu.au)
