[TeX/LaTeX] Hide text from displaying but retain it selectable and searchable

accsupp math-mode pdf symbols unicode

Continuing https://tex.stackexchange.com/a/463516, is there a way to create invisible PDF text associated with a drawn symbol? Assume that we have some command (say, \sqp) which draws a symbol but produces no searchable/selectable text in the PDF. We want to add a textual representation of the drawn symbol (say, Π, which should be neither displayed nor printed) to the PDF file so that it is selectable and searchable. An example of such text is discussed in "Included PDF image has invisible, but selectable text". Reusing @egreg's code from https://tex.stackexchange.com/a/463516/165772 (thanks!), I tried

\documentclass{article}
\usepackage{amsmath}
\usepackage{xparse}
\usepackage{accsupp}

\ExplSyntaxOn
\NewDocumentCommand{\sqp}{s}
 {
  \IfBooleanTF{#1}
   {
     \mathord { \mathpalette \egreg_sqp:Nn { \mathrm{o} } }
   }
   {
     \mathord {
     \mathpalette \egreg_sqp:Nn { }
     }
   }
 }

\box_new:N \l__egreg_sqp_temp_box
\dim_new:N \l__egreg_sqp_wd_dim % width
\dim_new:N \l__egreg_sqp_ht_dim % height
\dim_new:N \l__egreg_sqp_th_dim % thickness

\cs_new_protected:Nn \egreg_sqp:Nn
 {% #1 = style declaration, #2 = maybe o
  \group_begin:
  \dim_zero:N \mathsurround
  \hbox_set:Nn \l__egreg_sqp_temp_box { $#1\mathrm{o}$ }
  \dim_set:Nn \l__egreg_sqp_wd_dim { \box_wd:N \l__egreg_sqp_temp_box }
  \dim_set:Nn \l__egreg_sqp_th_dim { \box_wd:N \l__egreg_sqp_temp_box/4 }
  \hbox_set:Nn \l__egreg_sqp_temp_box { $#1\Pi$ }
  \dim_set:Nn \l__egreg_sqp_ht_dim { \box_ht:N \l__egreg_sqp_temp_box }

  \mspace{1mu}
  \tl_if_empty:nF { #2 }
   {
    \hbox_to_zero:n
     {
      \hbox_to_wd:nn { \l__egreg_sqp_wd_dim + 2\l__egreg_sqp_th_dim } { \hss $#1#2$ \hss }
      \hss
     }
   }
  \hbox_to_wd:nn { \l__egreg_sqp_wd_dim + 2\l__egreg_sqp_th_dim } { \__egreg_sqp_draw:N #1 \hss }
  \mspace{1mu}
  \group_end:
 }

\cs_new_protected:Nn \__egreg_sqp_draw:N
 {
  \driver_draw_begin:
  \driver_draw_moveto:nn { 0pt } { 0pt }
  \driver_draw_lineto:nn { 0pt } { \l__egreg_sqp_ht_dim }
  \driver_draw_lineto:nn { \l__egreg_sqp_wd_dim + 2\l__egreg_sqp_th_dim } { \l__egreg_sqp_ht_dim }
  \driver_draw_lineto:nn { \l__egreg_sqp_wd_dim + 2\l__egreg_sqp_th_dim } { 0pt }
  \driver_draw_lineto:nn { \l__egreg_sqp_wd_dim + \l__egreg_sqp_th_dim } { 0pt }
  \driver_draw_lineto:nn { \l__egreg_sqp_wd_dim + \l__egreg_sqp_th_dim } { \l__egreg_sqp_ht_dim - 0.7\l__egreg_sqp_th_dim }
  \driver_draw_lineto:nn { \l__egreg_sqp_th_dim } { \l__egreg_sqp_ht_dim - 0.7\l__egreg_sqp_th_dim }
  \driver_draw_lineto:nn { \l__egreg_sqp_th_dim } { 0pt }
  \driver_draw_closepath:
  \driver_draw_fill:
  \driver_draw_end:
 }

\ExplSyntaxOff

\begin{document}
\BeginAccSupp{method=hex,unicode,ActualText=03A0}$\sqp$\EndAccSupp{}
\end{document}

In the text layer, this does not produce Π (as I would expect), but only a 1 at the end of the document.
(An ocg-based solution, as opposed to the accsupp-based one, would probably require a fresh id on each use of the symbol; I wouldn't know how to generate it automatically, and it doesn't seem to produce selectable text. Anyway, I have not tried it.)

I'd prefer a solution that works with all three engines, {pdf|xe|lua}latex, even if one would have to use the conditionals \if.... As noted by @Raven, making the background white could work (unless you'd overlay the symbol again with something else, like an image); that's what I would do if I run out of options.

Best Answer

In my comments on the question, I proposed three different methods for writing text into a PDF so that it is invisible, but still selectable and searchable:

  1. Use "text rendering mode 3" (neither stroke nor fill) for the glyph shapes.

  2. Use white color for all characters on standard background (or colorize the text with the same color as its background).

  3. Use the standard text color (black), but print it onto a black background.

Here is an MWE for my first proposal. For now I'll not make one for the other two. The following shows how to use "text rendering mode 3" in order to write invisible (but searchable) text into a PDF and "do it by (more-or-less pure) LaTeX commands".

It makes use of...

  1. ...the \pdfliteral command, which inserts raw PDF code from the LaTeX source into the PDF output generated by pdflatex (not by xelatex or lualatex);
  2. ...the PDF code snippet 3 Tr, which sets the text rendering mode to 3 (meaning "no stroke and no fill"). (The q is PDF syntax to save the current graphics state, and the Q restores it, and with it the previous text rendering mode.)

Here is the MWE:

\documentclass{article}
\begin{document}
This is my normal text.\\
\makebox[50pt][c]{\pdfliteral page{q 3 Tr}This is hidden (but searchable) text.\pdfliteral page{Q}}
\\This is more normal text.\\
\end{document}
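
Since the question asks for something that works with all three engines, here is a minimal sketch of how the same idea might be wrapped in one macro. It rests on several assumptions that are not part of the MWE above: the name \hiddentext is made up for illustration, the iftex package is assumed to provide \ifpdf and \ifluatex, LuaTeX (0.85 and later) is assumed to accept the literal through \pdfextension literal, and for every other case (XeTeX, DVI output) the sketch simply falls back to proposal 2 (white text from xcolor), which only disappears on a white background. I have not checked how every viewer treats the result.

```latex
\documentclass{article}
\usepackage{iftex}  % assumed to provide \ifpdf and \ifluatex
\usepackage{xcolor} % \textcolor, used only in the fallback branch

% \hiddentext is a hypothetical helper name, not an existing package command.
\ifpdf
  \ifluatex
    % LuaTeX with PDF output: raw PDF code goes through \pdfextension literal
    \newcommand*{\hiddentext}[1]{%
      \mbox{\pdfextension literal page {q 3 Tr}#1\pdfextension literal page {Q}}}
  \else
    % pdfTeX with PDF output: same technique as in the MWE above
    \newcommand*{\hiddentext}[1]{%
      \mbox{\pdfliteral page {q 3 Tr}#1\pdfliteral page {Q}}}
  \fi
\else
  % XeTeX or DVI output: fall back to proposal 2 (text in the background color)
  \newcommand*{\hiddentext}[1]{\mbox{\textcolor{white}{#1}}}
\fi

\begin{document}
This is my normal text.\\
\hiddentext{This is hidden (but searchable) text.}\\
This is more normal text.
\end{document}
```

Unlike the \makebox[50pt][c]{...} in the MWE, the \mbox here keeps the natural width of the hidden text, so it occupies its own space on the line; put it into a zero-width or fixed-width box if it should sit on top of a drawn symbol.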

Compile this to PDF with pdflatex my.tex. Open the PDF in any viewer and select all (Ctrl+A on Windows and Linux, Cmd+A on macOS).

As you can see, there is something highlighted but not visible in between the two readable sentences. To find out what it is,...

  • ...either run pdftotext -layout my.pdf -,
  • ...or search for the string 'This is hidden (but searchable) text.'
