[TeX/LaTeX] Plotting floating point data with pgfplots

pdf pdftex pgfplots tikz-pgf

Is there a binary file structure supported by pgfplots to allow floating point numbers to be loaded and plotted? I understand the time taken to convert floating point data to a text file is small, but for large amounts of data (such as an animation), the time to process the file can be quite high. It seems illogical to convert the data from a floating point format, only for pgfplots to load and convert it back. Additionally, could somebody please explain how coordinates are translated to PDF data – I believe PDF objects can contain 32 bit floating point numbers. Is it therefore possible to place a floating point coordinate from a binary file directly into a PDF object, thereby avoiding 2 (or 3?) stages of conversion (and indeed retaining lossless coordinates)?

I realise that the structure of the data in a binary file is less apparent than in a plain text file, but I would be able to arrange the binary (or binaries) accordingly. I've included a basic example to try to make things clearer.

\documentclass{article}
\usepackage{filecontents,pgfplotstable}
\pgfplotstableset{create on use/x/.style={create col/copy column from table={x.csv}{0}}}
\begin{filecontents}{x.csv}
1.1
1.2
1.3
\end{filecontents}
\begin{filecontents}{y1.csv}
2.5
3.2
2.9
\end{filecontents}
\begin{filecontents}{y2.csv}
2.6
3.0
3.1
\end{filecontents}
\begin{document}
\tikz{\pgfplotstableread{y1.csv}\y
\begin{axis}
\addplot table[x=x,y=0]\y;
\end{axis}}

\tikz{\pgfplotstableread{y2.csv}\y
\begin{axis}
\addplot table[x=x,y=0]\y;
\end{axis}}
\end{document}

Instead of the contents of 'x.csv', 'y1.csv' and 'y2.csv' being those provided above, they would be binaries with the following hexdumps:

x.csv

3f8ccccd 3f99999a 3fa66666

y1.csv

40200000 404ccccd 4039999a

y2.csv

40266666 40400000 40466666

Best Answer

You asked two questions; I will address them briefly:

1 Can pgfplots read binary representations of floating point numbers?

In short: no.
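
If the data only exists in binary form, one possible workaround (not a pgfplots feature) is to let LuaLaTeX decode the file and write a temporary text table before plotting. The following is a minimal sketch under several assumptions: the document is compiled with lualatex (LuaTeX 1.0 or later, for Lua 5.3's `string.unpack`), the hypothetical file `y1.bin` holds big-endian 32-bit floats with no header, and `bin2txt` is a made-up helper name. Note that it still goes through text, so it does not avoid the conversion; it only hides it inside the TeX run.

```latex
% Compile with lualatex (assumption: LuaTeX >= 1.0, which embeds Lua 5.3).
\documentclass{article}
\usepackage{pgfplots}
\usepackage{luacode}
\begin{luacode*}
-- Hypothetical helper: read big-endian float32 values from `infile`
-- and write them as one decimal number per line to `outfile`.
function bin2txt(infile, outfile)
  local f = assert(io.open(infile, "rb"))
  local data = f:read("a")
  f:close()
  local out = assert(io.open(outfile, "w"))
  for pos = 1, #data - 3, 4 do
    -- ">f" = big-endian single-precision float (assumed byte order)
    local value = string.unpack(">f", data, pos)
    out:write(string.format("%.9g\n", value))
  end
  out:close()
end
\end{luacode*}
\begin{document}
% y1.bin is a hypothetical input file; y1.csv is regenerated on every run.
\directlua{bin2txt("y1.bin", "y1.csv")}
\begin{tikzpicture}
\begin{axis}
% Plot the decoded values against their row index.
\addplot table[x expr=\coordindex, y index=0] {y1.csv};
\end{axis}
\end{tikzpicture}
\end{document}
```

Nine significant digits are enough to round-trip a 32-bit float through decimal text, which is why the sketch uses `%.9g`.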

2 How are coordinates mapped into a PDF?

This is quite a complex operation. First, all coordinates are collected in order to compute the axis limits; this is the "survey phase". Afterwards, all coordinates are mapped into the 32-bit fixed-point number range of TeX/PDF/PS. These numbers are written to the output file as plain text (after applying any transformations such as scaling or stretching).
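
To make the fixed-point idea more concrete, here is a small sketch of the two number formats involved. It only illustrates the representations. It is not the code path pgfplots actually takes internally. `\mycoord` is a hypothetical register name; `\pgfmathfloatparsenumber` and `\pgfmathfloattofixed` are macros from pgf's fpu library, which is loaded explicitly here to be safe.

```latex
\documentclass{article}
\usepackage{pgfplots}
\usepgflibrary{fpu} % floating point unit of pgf (loaded explicitly to be safe)
\newdimen\mycoord   % hypothetical register name, for illustration only
\begin{document}
% TeX stores a length as an integer number of scaled points (65536sp = 1pt),
% i.e. as a fixed-point number.
\mycoord=1.1pt
The length 1.1pt is stored as \number\mycoord\ sp
and printed back as \the\mycoord.

% Input numbers are first parsed into pgf's own floating point format ...
\pgfmathfloatparsenumber{1.1e3}
% ... and can be converted to fixed-point text when needed.
\pgfmathfloattofixed{\pgfmathresult}
The fixed-point form of 1.1e3 is \pgfmathresult.
\end{document}
```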


Aside from these answers, I see that you are actually looking for something else, for which the answers above are merely a by-product. You appear to be wondering how to optimize something: apparently the loading of large numbers of data files for animations, or perhaps their processing by means of pgfplots.

I agree that animations may call for a suitable data file format, but whether a binary "CSV" is the right one seems questionable. Whether pgfplots is the right tool for reading "huge animation data files" seems just as questionable. Do you want to import AVIs?

If you want to reduce the time that pgfplots spends on data in general, you should file a feature request or bug report. Note that it is entirely unclear whether number parsing is a bottleneck at all (a pity that there are no powerful profilers). In fact, I would expect the bottlenecks to be somewhere else.

If you want to improve the quality by reducing the number of numeric operations, you may want to write a low-level driver file which writes directly to the PDF. There are some operations in PDF which actually expect binary data, but they expect mapped integers rather than floating point numbers. I would expect such an approach to produce exactly the same quality as pgfplots (and would hope for a bug report if it does not).
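
As a rough illustration of what "writing directly to a PDF" can look like, the following sketch bypasses pgf entirely and emits path operators with pdfTeX's `\pdfliteral` primitive. It assumes compilation with pdflatex (other engines use different primitives), and the coordinates are made-up values in PDF units (bp), relative to the current point. Note that the numbers are still decimal text in the PDF content stream.

```latex
% Compile with pdflatex; \pdfliteral is a pdfTeX primitive.
\documentclass{article}
\begin{document}
% Reserve a 100bp x 50bp box; the literal path is drawn relative to the
% current point, here the lower-left corner of the box.
\noindent
\hbox to 100bp{%
  % q/Q save and restore the graphics state, w sets the line width,
  % m = moveto, l = lineto, S = stroke.
  \pdfliteral{q 0.5 w 0 0 m 25.5 40.25 l 60 10.125 l 100 50 l S Q}%
  \vrule width 0pt height 50bp
  \hfil}
\end{document}
```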

I hope these thoughts help to improve the search for an answer and to clarify the use-case(s) that you have in mind.