[Tex/LaTex] How to get intersection point(s) of two glyphs

fontssymbols

Longer post

I'd like to introduce you my long-term task which I wish to see solved one day.

Let's have two glyphs (any fonts, glyphs, location, transformation), one fixed, one moveable and some (randomly selected) direction. How to find a position of the second glyph where those two glyphs touch?

I'm using manual positioning and it cannot be used for tasks of word cloud magnitude or other optimization tasks (filling a glyph by smaller glyphs like the shapepar package can handle within basic shapes etc.).
I haven't done extensive research on word cloud (also known as tag cloud) methods and algorithms, but my first impression was that the algorithms use bounding box of the words or of the glyphs. But I might be mistaken (I'm), please see this example from Wikipedia. We can clearly see there are words inside letters, so they are using some complex algorithms. This is a fine starting point: http://www.jasondavies.com/wordcloud/about/. More about algorithms.

A note to computer scientists. This task belongs under collision detection (processing images). It has a wide area of use not only in visualization. I enclose a word cloud example and filling (any) space/glyph examples:

wikipedia: word cloud

filling space situation

Let me illustrate this situation. Let's choose letters a and b from the Latin Modern family, left-right direction, no transformation. We are getting (in this example I'm using \kern):

mwe, part 1

Update: From mathematical point of view, there might be and often are more solutions when we move the second glyph further to the left. For instance, when we typeset ab in the first example we are getting the second solution:

mwe, part 1, next solution

Let's try another example with some transformation of letter b and direction of 20 degrees (I'm using TikZ for glyph movement):

mwe, part 2

After finding an intersection point we could move the second glyph backwards to get a certain amount of white space reserve between two glyphs in that direction.

This is one more example, which could be used for two-step optimization. First step would be finding an intersection point and after setting up an anchor we could scale (or we could rotate) the second glyph and we would find next intersection point. It's rather complex example:

mwe, part 3


What are our options?

I'm using manual transformation (location, rotation, scale) of the second glyph. That's not practical way.

My first (poor) idea was to get a series of raster graphics where I would compute number of non-white pixels. If a number of pixels of letter a and b typeset separately would be the same as number of pixels of both letters typeset together, that would be the proof they are not touching yet. It would be computationally ineffective. Update: When we try to move letters in How the Word Cloud Generator Works, it looks they use rasterization and this technique. They use spiral path to find the best spot for placing the next word. It's possible it isn't that poor idea. It would be an interesting problem at TeX.SX if we are able to raster a vector graphics on-the-fly and process it at the pixel level. I'm thinking of Lua(TeX) and its tools (or TikZ and a new externalization library, perhaps?). I enclose an example from Jason Davies's website. His main website and those examples are inspiring, http://www.jasondavies.com/. Red means the glyphs are not touching, yet, the orange means they are overlapping.

rasterization technique

I was hoping that I could turn off kerning and move the second glyph by its left bearing. But that's not enough as we try to consider also glyph transformation.

I started to believe I'm getting closer when I noticed that the glyph consists of Bézier curves which are stored in the font file (PFB, TTF, OTF) in a form of control points. We can get Bézier curves from the font files then.
I enclose a small example from the Metapost manual, see page 50 for further details, where the control points can be seen. The native output format of Metapost is PostScript, we can get SVG and PNG by Metapost or by other conversion tools easily. We could use pstoedit to export the PostScript format to some more vector formats.

We run the following lines:

mpost mal-mpost.mp  
ps2pdf mal-mpost.56  
pdfcrop --hires mal-mpost.56.pdf  

The content of mal-mpost.mp is this:

fontmapfile "=lm-ec.map";
beginfig(56);
picture q;
path p;
interim ahlength := 12bp;
interim ahangle := 25;
q := glyph "Dcaron" of "ec-lmr10" scaled .2;
for item within q:
p := pathpart item;
drawarrow p withcolor (.6,.9,.6)
withpen pencircle scaled 1.5;
for j=0 upto length p:
pickup pencircle scaled .7;
draw (point j of p -- precontrol j of p)
dashed evenly withcolor blue;
draw (point j of p -- postcontrol j of p)
dashed evenly withcolor blue;
pickup pencircle scaled 3;
draw precontrol j of p withcolor red;
draw postcontrol j of p withcolor red;
pickup pencircle scaled 2;
draw point j of p withcolor black;
endfor
endfor
endfig;
bye;

mwe, metapost

Another way would be to use FontForge and export glyphs to a single SVG file or to a series of the SVG files (one file per glyph). Our input would look like this (cut version):

<path fill="#000000" stroke="none" d="
M 16.07 598.52
C 12.15 596.38 10.29 593.46 10.29 589.48
C 10.29 584.21 13.72 581.58 25.70 577.59
C 43.87 571.58 60.97 564.17 63.35 561.28
C 64.70 559.64 65.31 557.52 64.85 556.06
C 63.64 552.26 66.69 541.58 70.74 535.48
C 72.76 532.45 76.78 528.51 79.69 526.75
C 86.19 522.82 101.12 511.82 102.10 510.26

The SVG format is using M for move, L for a straight line and C for Bézier curve. As we can see from the above example to have the control points is not enough for us, we need to reconstruct the Bézier curve(s).

There is a small potential improvement. We could use svg2tikz script to convert it to the TikZ code. The TikZ code would look like this (cut version):

\begin{tikzpicture}[y=0.80pt, x=0.8pt,yscale=-1, inner sep=0pt, outer sep=0pt]
\begin{scope}[shift={(0,0)},scale=1.000]
  \path[fill=black] (16.0700,598.5200) .. controls (12.1500,596.3800) and
    (10.2900,593.4600) .. (10.2900,589.4800) .. controls (10.2900,584.2100) and
    (13.7200,581.5800) .. (25.7000,577.5900) .. controls (43.8700,571.5800) and
    (60.9700,564.1700) .. (63.3500,561.2800) .. controls (64.7000,559.6400) and
    (65.3100,557.5200) .. (64.8500,556.0600) .. controls (63.6400,552.2600) and
    (66.6900,541.5800) .. (70.7400,535.4800) .. controls (72.7600,532.4500) and
    (76.7800,528.5100) .. (79.6900,526.7500) .. controls (86.1900,522.8200) and
    (101.1200,511.8200) .. (102.1000,510.2600) .. controls (102.4800,509.6300) and 

I was thinking that it might be a way. Once we have a series of curves we can find intersection points. I enclose a snippet of what I think might work. That's an intersection of two paths, I've chosen left-right direction. The variable \t in TikZ holds information about number of points (0, 1, 2, …). When \t=0 that means the paths are not touching, any other value means they are. We can get position of the intersection points, please see the TikZ manual and section about the intersections library, please see pages 139+ or 987+ in the TikZ3 manual.

I'm sure that finding intersection points would be than easy in other programs (Metapost, PSTricks, Asymptote, …). My idea is to prepare two sets of paths, one set for each glyph, and compare all paths from one set with paths in the second set. In theory, that should assure that we won't miss any intersection point. This will require some experimenting.

mwe, tikz, intersection of two paths

LuaTeX is preprocessing font files and store information about glyphs in the Lua file, e.g. please see lmroman10-regular.lua (cut version). We can find bounding box and other information, but not the control points of the Bézier curves. That would increase size of Lua file significantly. We may use FontForge and its native SFD format or we could convert TTF/OTF font files to XML by other tools, e.g. by TTX (tested), or to SVG, e.g. by ttftosvg (http://everythingfonts.com/ttf-to-svg) (untested). We would get the control points but it leaves the glyphs transformation unprocessed.

return {
 ["cache_version"]=2.749,
 ["descriptions"]={
  [32]={
   ["boundingbox"]=1,
   ["index"]=103,
   ["name"]="space",
   ["width"]=333,
  },
  [33]={
   ["boundingbox"]={ 86, 0, 192, 716 },
   ["index"]=53,
   ["name"]="exclam",
   ["width"]=278,
  },
  [34]={
   ["boundingbox"]={ 102, 423, 272, 705 },
   ["index"]=93,
   ["name"]="quotedbl",
   ["slookups"]={
    ["ctx_tlig_1_1"]=8221,
   },
   ["width"]=374,
  },

I enclose the TeX code of the snippets mentioned in the post. We can run xelatex and lualatex engines which can handle loading of the OTF file directly.

% run: xelatex or lualatex mal-intersection.tex
\documentclass[a4paper,landscape]{article}
\pagestyle{empty}
\usepackage{tikz}
\tikzset{inner sep=0pt, outer sep=0pt, yes/.style={green}, no/.style={red}}
\usetikzlibrary{intersections}
\usepackage{fontspec}
\setmainfont[Scale=15]{lmroman10-regular.otf}

\begin{document}

a{\color{red}b} % common typesetting
a\kern-6.5mm{\color{green}b} % manual solution 
\begin{tikzpicture}
\begin{pgfinterruptboundingbox}
\draw[no,<-,line width=2mm] (5,4)--(8,4);
\end{pgfinterruptboundingbox}
\end{tikzpicture}%

\newpage
a{\color{red}b}\hspace{20mm} % common typesetting
a\kern-51mm{\color{green}b} % manual solution 
\begin{tikzpicture}
\begin{pgfinterruptboundingbox}
\draw[no,<-,line width=2mm] (-12,4.5)--(-6,4.5);
\end{pgfinterruptboundingbox}
\end{tikzpicture}%

\newpage
\begin{tikzpicture}
\node{a};
\node[rotate=45,no] at (20:3cm){b};
\draw[no,<-,line width=2mm] (20:5cm)--(20:7cm);
\end{tikzpicture}%
\begin{tikzpicture}
\node{a};
\node[rotate=45,yes] at (20:2.5cm){b};
\end{tikzpicture}

\newpage
\begin{tikzpicture}
\node{a};
\node[scale=0.1,yshift=-40mm,no,rotate=-45]{b};
\draw[no,->,line width=2mm] (0,2mm)--(0,10mm);
\end{tikzpicture}%
\begin{tikzpicture}
\node{a};
\node[scale=0.1,yshift=-13.5mm,no,rotate=-45]{b};
\draw[no,<-,line width=2mm] (0,2mm)--(0,10mm);
\end{tikzpicture}%
\begin{tikzpicture}
\node{a};
\node[scale=0.1,yshift=-13.5mm,rotate=-45,yes,scale=3,xshift=8mm,yshift=-13mm]{b}; 
% + setting an anchor
\end{tikzpicture}%

\begin{tikzpicture}[line width=1mm]
\draw[name path=first] (-5,10) .. controls (0,5) and (0,-5) .. (-1,-1);
\draw[name path=second,no] (1,10) .. controls (0,5) and (0,-5) .. (1,-1);
\draw[no,<-,line width=2mm] (-20mm,70mm)--(0,70mm);
\end{tikzpicture} %
\begin{tikzpicture}[line width=1mm]
\draw[name path=first] (-5,10) .. controls (0,5) and (0,-5) .. (-1,-1);
\draw[name path=second,xshift=-8.1mm,yes] (1,10) .. controls (0,5) and (0,-5) .. (1,-1);
\draw[name intersections={of=first and second, name=i, total=\t}] 
  node {\typeout{total points: \t}}; % total points: 2 (-8.1mm)
\end{tikzpicture}%

\end{document}

My humble proposals

1. True/False

After \checkme{glyphs}{glyphs} we might get:

  • True, if glyphs have some intersection points. Subquestion: How many we have got?
  • False, if they don't. Subanswer: We have got 0 intersection points.

2. Distance

After \checkdistance{glyphs}{glyphs}{angle} we might get:

  • Some dimension, e.g. 2.3cm, which measure minimum distance between those glyphs in that direction.
  • We should get all the solutions, e.g. when glyphs are inside glyphs, the touching point will be get from the opposite side of the glyphs etc.
  • We should get warning or \dimenmax if there is no solution and no intersection can be found. The subquestion could be what's the nearest change in angle to get some intersection point.

3. Graphical representation

  • If the glyphs are already overlapping we should get marks at those points (the intersection points of the Bézier curves).
  • We might get that overlapping region in different color.
  • We should get a list of intersection points as a list of coordinates in a log file and/or at the terminal.
  • The subquestion could be what's the minimum distance (and its angle) to get no intersection points (we would be separating overlapping glyphs).

4. Filling space

  • How to fill some empty space with a glyph (using location, rotation, and/or scale) to get the biggest glyph.
  • We should be able to fix some of the parameters, e.g. location, and let the optimization works with other two parameters (rotation, scale).

5. Rasterization

  • Until now, I have written down some ideas using vector form, but it might be possible to raster our glyphs first and then to do some calculations as algorithms in word/tag clouds do. Outside the TeX we use GhostScript (or some backend tools like ImageMagick or GraphicsMagick) to handle this task.
  • We might be able to use TikZ and its externalization library to get raster graphics (or shall we use PDF -> Metapost conversion with help of the pstoedit tool, perhaps?), but how to process those pictures on-the-fly? With the help of ImageMagick and \write18, perhaps?

Update: A demonstration

I've found an interesting JavaScript demonstration of this task, please try http://paperjs.org/examples/path-intersections/ for yourself.

paperjs.org

Best Answer

a proof of concept

A preliminary version: a proof of concept

{Fanfare} I'm so excited when writing these lines that I cannot even breathe (I'm feeling well, doctor, I'm really fine). :-) Well, it's a big TeX moment for me, you are a part of it!

I was reading Letterpress effect through PSTricks or Tikz where I've noticed that Andrew Stacey converted some STIX fonts to TikZ paths/curves somehow - I'm not sure if the source codes were OTF/PFB files or some other source files (PL, VPL, WFF, EOT, perhaps?). Maybe svgtopgf.pl script from http://bazaar.launchpad.net/~tex-sx/tex-sx/development/files/170 has been used (I've tried some random SVG picture and it wasn't working well for me). It's hard to guess without further research, but it works!

In the meantime, I've read in this question Outlining (filling) glyph outline with text in TikZ that all the major graphics engines (Metapost, PSTricks, Asymptote) should be able to load glyphs at the curve level. I'm not sure about TikZ, but it can load SVG paths, it might help.

Let's get back to our minimal working example.

Step 1: Downloading the fonts converted to paths

I've downloaded two files, pgflibraryshapes.letters.dtx and stikz-normal-paths.tex, to my working directory.

Step 2: Installing dtx file (it includes support TeX files)

I've processed/installed the first file by running this line once:

tex pgflibraryshapes.letters.dtx

Step 3: Running an example and using it

We can run any major LaTeX engine, e.g.

lualatex mal-letters.tex

This is the content of the TeX file:

% run: any major LaTeX engine mal-letters.tex
% Based on and inspired by:
% https://tex.stackexchange.com/q/62570/86
\documentclass{article}
\pagestyle{empty}
\usepackage{tikz}
\usetikzlibrary{shapes.letters} 
\addtolength{\textheight}{2in}
%\usetikzlibrary{fadings} % I am cutting down the example to bare minimum...
%\usetikzlibrary{shadows.blur}
\usetikzlibrary{intersections}
\pgfkeys{
  /pgf/letter/.cd,
  load font={stikz}{normal},
  size=4,
  load encoding=char,
  every letter/.append style={
    fill, draw=red, line width=1pt,
  },
  }% End of \pgfkeys...
\makeatletter
\tikzset{
  use letter path/.code={%
    \pgfscope
    \pgftransformscale{\letter@size}%
    \letter@path{\letter@encode{#1}}%
    \endpgfscope
  }% end of use letter...
}% end of \tikzset...
\makeatother

\begin{document}
%\newcount\malrotate
%\malrotate=-10
%\loop
%\advance\malrotate by 10
\foreach \malrotate in {0,10,...,180} {% 45    0,10,...,180
\begin{tikzpicture}
\begin{scope}[xshift=20mm, yshift=-7mm, rotate=\malrotate]
\path[name path global=first, use letter path=T, fill=red];
% draw=green, line width=1pt,
\end{scope}
\begin{scope}[yshift=6mm, xshift=2mm, rotate=-\malrotate]
\path[name path global=second, use letter path=B, fill=blue, opacity=0.4]; 
% draw=green, line width=1pt,
\end{scope}
\fill[name intersections={of=first and second, name=i, total=\t}] [black]
  \ifnum\t>0%
node{%
  %\pgfmathparse{\t}%
  %\global\let\mtotal=\pgfmathresult
  \typeout{Number of intersection points: \t}%
  }%
\foreach \s in {1,...,\t} {%
  (i-\s) circle (1pt) node[above]{\footnotesize\s}%
  }%
\else
node{\typeout{There are no intersection points!}}%
  \fi;
  %\typeout{Number of intersection points: \t};
\end{tikzpicture} %
}% End of \foreach...
%\ifnum\malrotate<180\repeat
\end{document}

The good news is that it's working, the bad news is that we are getting an incorrect number of intersection points when the picture is drawn within a cycle (\foreach or \loop...\repeat), so there might be a bug (when using name path global? Or maybe there is a problem in the support files? I don't know at the moment.). The very first and the very last message are correct for sure.

Number of intersection points: 4
Number of intersection points: 9
Number of intersection points: 20
Number of intersection points: 32
Number of intersection points: 19
Number of intersection points: 17
Number of intersection points: 23
Number of intersection points: 15
Number of intersection points: 13
Number of intersection points: 10
Number of intersection points: 16
Number of intersection points: 19
Number of intersection points: 17
Number of intersection points: 26
Number of intersection points: 19
Number of intersection points: 24
Number of intersection points: 19
Number of intersection points: 7
There are no intersection points!

This example loads two glyphs and names them. It transforms them with the help of the scope environments as it cannot be done directly in \path command as far as I can say (although I haven't tested the cm parameter, yet). Then we find intersection points and use them (see some more examples and settings in the TikZ manual, search for intersections library). If we draw one picture at a time, the number of points is correct.

Well, there is a lot of work ahead (converting some other fonts, setting up and testing optimization with different parameters on real examples), but this is the core.

I know about practical tasks where I'm going to need it soon for sure:

  • a background picture consisting of touching glyphs for the forthcoming conference proceedings cover, it should be an improved version to this example How to differentiate glyphs by script (semi-)automatically?,
  • a glyph inside a glyph which are inside another glyph..., e.g., letter T inside letter A and both inside letter B, and,
  • putting some words inside an arrow with some specific whitespace reserve around the terms (that's a common case of the word clouds) as an advertisement for local theatre play.

I enclose a preview of that page and a closeup shot with some additional settings.

mwe, full page

mwe, closeup shot

Related Question