[Tex/LaTex] How to use very large data sets with pgfplots without making the PDF take a very long time to load

pgfplots

I'm working with a very large data set (almost 120,000 observations) and I'm using pgfplots to make some plots. Compilation goes OK, especially with the external library, but the resulting PDF documents take a very long time to load. It seems as though Adobe Reader is drawing each data point individually.

Is there a better way to use pgfplots to make the plots easier to render?

My setup is:

\documentclass[12pt]{article}
\usepackage{tikz}
\usepackage{pgfplots}
\usepgfplotslibrary{external}
\tikzexternalize

\begin{document}

\begin{figure}[h]
\footnotesize
\centering
\caption{Publicly Available Num Tested, MEAP\_stacked Number Valid}
\begin{tikzpicture}
\begin{axis}[
only marks,
width=6.2in,
ylabel={Number Tested, Public Data},
xlabel={Number Valid, Our Data},
area legend,
legend style = {at={(axis cs:700,0)},anchor=south east},
xmin=-50,
ymin=-50]
\input{statistics/Num_Valid_Writing.tex}
\addlegendentry{Writing}
\input{statistics/Num_Valid_Math.tex}
\addlegendentry{Math}
\input{statistics/Num_Valid_ELA.tex}
\addlegendentry{ELA}
\input{statistics/Num_Valid_Science.tex}
\addlegendentry{Science}
\input{statistics/Num_Valid_SS.tex}
\addlegendentry{Social Studies}
\input{statistics/Num_Valid_Reading.tex}
\addlegendentry{Reading}
\end{axis}
\end{tikzpicture}
\end{figure}

\end{document}

I'm using pdfLaTeX on Ubuntu with TeX Live.

Thanks!

Best Answer

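This answer may require the development (CVS) version of pgf; I am not sure.

The external library can be customized to produce PNG files on its own, with correctly scaled images. The viewer then only has to display a single bitmap instead of drawing every marker.

The pgf manual (possibly only the CVS version) describes the process in detail in Section 34.7, "Bitmap Graphics Export", within its documentation of the external library.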

The trick is to use


\tikzset{
    % Defines a custom style which generates BOTH, .pdf and .png export
    % but prefers the .png on inclusion.
    %
    % This style is not pre-defined, you may need to copy-paste and
    % adjust it.
    png export/.style={
        external/system call/.add={}{; convert -density 300 -transparent white "\image.pdf" "\image.png"},
        %
        /pgf/images/external info,
        /pgf/images/include external/.code={%
            \includegraphics
            [width=\pgfexternalwidth,height=\pgfexternalheight]
            {##1.png}%
        },
    },
    %
    png export,% ACTIVATE
}

in your preamble.

There are three steps involved:

a) modifying the system call (no problem; this should be possible with any version of the external library);

b) the /pgf/images/external info key (probably available in pgf CVS only), which generates TeX dimension information for every exported image and stores it appropriately;

c) using \pgfexternalwidth (and \pgfexternalheight) when loading the image (these are the output of the 'external info' key).
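
To make step (a) more concrete, here is a sketch of what the system call looks like once the `.add` handler has appended the conversion command. It assumes the pdfLaTeX default system call as documented in the pgf manual; other engines or pgf versions may use a different default, so treat the first command as an assumption and check your own manual.

```latex
% Preamble fragment; requires \usetikzlibrary{external} or
% \usepgfplotslibrary{external} to be loaded first.
% ASSUMPTION: the first command is the documented pdfLaTeX default;
% your pgf version or engine may use a different one.
\tikzset{
    external/system call={%
        pdflatex \tikzexternalcheckshellescape -halt-on-error
        -interaction=batchmode -jobname "\image" "\texsource";
        convert -density 300 -transparent white "\image.pdf" "\image.png"},
}
```

Using `.add={}{...}` as in the style above is usually preferable, because it keeps whatever default your installation provides and only appends the `convert` call.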

The \tikzset statement in my suggested solution defines a new style 'png export'. Rather than activating it globally as in my solution, you can enable it for a single picture by wrapping that picture in a group:


{% from here on, png export will be used....
\tikzset{png export}%
\begin{tikzpicture}
....
\end{tikzpicture}
}% here ends the png export
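
Putting the pieces together, a complete document might look roughly like the following sketch. The file name `main.tex`, the random data, and the second (vector) picture are illustrative assumptions and not part of the original setup. It also assumes that ImageMagick's `convert` is installed, that shell escape is enabled, and that your pgf version provides the 'external info' key.

```latex
% Hypothetical file main.tex; compile with: pdflatex -shell-escape main.tex
% ASSUMPTIONS: ImageMagick's `convert` is on the PATH, and the installed
% pgf version provides /pgf/images/external info.
\documentclass{article}
\usepackage{graphicx}
\usepackage{pgfplots}
\usepgfplotslibrary{external}
\tikzexternalize

\tikzset{
    % Same style as in the answer, defined here but NOT activated globally.
    png export/.style={
        external/system call/.add={}{; convert -density 300 -transparent white "\image.pdf" "\image.png"},
        /pgf/images/external info,
        /pgf/images/include external/.code={%
            \includegraphics[width=\pgfexternalwidth,height=\pgfexternalheight]{##1.png}%
        },
    },
}

\begin{document}

% Dense scatter plot: exported to PDF, converted, and included as a PNG.
{\tikzset{png export}%
\begin{tikzpicture}
\begin{axis}[only marks]
    % Illustrative random data standing in for the real observations.
    \addplot+[samples=5000, domain=0:1] {rand};
\end{axis}
\end{tikzpicture}
}

% Light plot outside the group: still included as a vector PDF.
\begin{tikzpicture}
\begin{axis}
    \addplot+[no marks, domain=0:1] {x^2};
\end{axis}
\end{tikzpicture}

\end{document}
```

Keeping the style local to the heavy pictures means that axis labels and lines in the remaining plots stay sharp vector graphics, while only the dense scatter plots are rasterized.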