[TeX/LaTeX] Copying ligatures from XeLaTeX PDF

copy/paste, ligatures, pdf, xetex

For quite a while I have had the problem that copying from and searching in my PDFs is difficult because ligatures are not properly translated. I am using XeLaTeX with Libertine/Biolinum.

I am a simple user, so I tried workarounds I found on the internet ("Make ligatures in Linux Libertine copyable (and searchable)", "\pdfglyphtounicode with XeTeX", "Can PDF search find words with ligatures in XeLaTeX-documents?"), but none of them work.

Here's my MWE

%!TEX TS-program = xelatex 
%!TEX encoding = UTF-8 Unicode 
\documentclass{scrreprt}
\usepackage{fontspec}
%\defaultfontfeatures{Ligatures=Historic}
%\setmainfont{Linux Libertine O}
\usepackage{libertine}
\begin{document}
fluffier soufflé fisticuffs fb fh ffh fj ffj fk ffk ft fft tt Qu Th ch ck ct
\end{document}

Copying the text from the resulting PDF gives

u er sou é sticu s ch ck ct

for the above and

u er sou é icu s ch ck

when I use the historic ligatures.

Using the \input{glyphtounicode} workaround, I get:

Undefined control sequence.
l.7 \pdfglyphtounicode{A}{0041}

Using \usepackage[t1]{fontenc} I get:

/usr/local/texlive/2014/texmf-dist/tex/latex/base/fontenc.sty:100: LaTeX Error: Encoding scheme `t1' unknown.

See the LaTeX manual or LaTeX Companion for explanation.

Type H for immediate help.

l.100 \fontencoding\encodingdefault\selectfont

Experimenting with other fonts shows very mixed results, so while it's obviously possible that the problem is in the fonts, is there something, anything, I can do to work around this and still keep ligatures?

Something like the above-mentioned

\input{glyphtounicode}

\pdfglyphtounicode{f_f}{FB00}

where I could "translate" the ligatures by hand – the above doesn't work for me, though.

Best Answer

Try adding \XeTeXgenerateactualtext=1 at the start of your document.

(As far as I recall, this requires XeTeX from TeX Live 2016 or later, or an equivalent version from another distribution such as MiKTeX. The result of copy/paste also depends on the PDF reader used, as not all PDF viewers support ActualText.)
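
As a minimal sketch, assuming a XeTeX version recent enough to provide the primitive, the MWE from the question with this setting applied might look like the following. Putting the line directly after \documentclass is just one reading of "at the start of your document"; the rest of the preamble is unchanged from the question.

```latex
%!TEX TS-program = xelatex
%!TEX encoding = UTF-8 Unicode
\documentclass{scrreprt}
% Ask XeTeX to write ActualText entries into the PDF, so that ligature
% glyphs carry the characters they stand for (needs a recent XeTeX).
\XeTeXgenerateactualtext=1
\usepackage{fontspec}
\usepackage{libertine}
\begin{document}
fluffier soufflé fisticuffs fb fh ffh fj ffj fk ffk ft fft tt Qu Th ch ck ct
\end{document}
```

Compile with xelatex and try copying the line from the PDF again. Whether the ligatures come back as plain letters still depends on the viewer, so it may be worth testing in more than one.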