For quite a while I have had the problem that copying and searching text in my PDFs is difficult because ligatures are not properly translated. I am using XeLaTeX with Libertine/Biolinum.
I am a simple user, so I tried workarounds I found on the internet ("Make ligatures in Linux Libertine copyable (and searchable)"; "\pdfglyphtounicode with XeTeX"; "Can PDF search find words with ligatures in XeLaTeX-documents?"), but none of them worked.
Here's my MWE:
%!TEX TS-program = xelatex
%!TEX encoding = UTF-8 Unicode
\documentclass{scrreprt}
\usepackage{fontspec}
%\defaultfontfeatures{Ligatures=Historic}
%\setmainfont{Linux Libertine O}
\usepackage{libertine}
\begin{document}
fluffier soufflé fisticuffs fb fh ffh fj ffj fk ffk ft fft tt Qu Th ch ck ct
\end{document}
Which renders
u er sou é sticu s ch ck ct
for the above and
u er sou é icu s ch ck
when I use the historic ligatures.
Using the \input{glyphtounicode} workaround, I get:
Undefined control sequence.
l.7 \pdfglyphtounicode{A}{0041}
Using \usepackage[t1]{fontenc}
I get:
/usr/local/texlive/2014/texmf-dist/tex/latex/base/fontenc.sty:100: LaTeX Error: Encoding scheme `t1' unknown.
See the LaTeX manual or LaTeX Companion for explanation.
Type H for immediate help.
l.100 \fontencoding\encodingdefault\selectfont
Experimenting with other fonts shows very mixed results, so the problem may well lie in the fonts themselves. Still, is there anything I can do to work around this and keep the ligatures?
Something like the above-mentioned
\input{glyphtounicode}
\pdfglyphtounicode{f_f}{FB00}
where I could "translate" the ligatures by hand – the above doesn't work for me, though.
Best Answer
Try adding
\XeTeXgenerateactualtext=1
at the start of your document. (IIRC, this requires XeTeX from TeX Live 2016 or later, or an equivalent from other distributions such as MiKTeX. The result of copy/paste will also depend on the PDF reader used, as not all PDF viewers support ActualText annotations.)
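Applied to the MWE from the question, this might look like the sketch below. The \ifdefined guard is my own addition, not part of the original suggestion; it assumes you would rather skip the setting on an older XeTeX than hit an "Undefined control sequence" error. The shortened test sentence is only for illustration.

```latex
%!TEX TS-program = xelatex
%!TEX encoding = UTF-8 Unicode
\documentclass{scrreprt}
\usepackage{fontspec}
\usepackage{libertine}
% Ask XeTeX to record the original characters as /ActualText in the PDF,
% so that ligature glyphs can be copied and searched as plain letters.
% The guard (an assumption on my part) skips this on engines that lack the primitive.
\ifdefined\XeTeXgenerateactualtext
  \XeTeXgenerateactualtext=1
\fi
\begin{document}
fluffier soufflé fisticuffs fb fh ffh fj ffj fk ffk ft fft tt Qu Th ch ck ct
\end{document}
```

After compiling with xelatex, try copying the text or searching for "fluffier" in more than one PDF viewer, since support for ActualText varies between viewers.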