I implemented this for a large R&D lab. We produced several hundred (if not a thousand) documents per year, and the LaTeX user community there wanted to be able to produce documents using LaTeX as well as WYSIWYG software.
The OP was right in that a well-defined workflow is essential. Part of this is the process itself, but you may also need to think about training, a common repository, and how to implement the corporate design.
Process
We implemented a process that allowed people to work in LaTeX and then switch to .docx for collaborators.
1. Define a class file that contains the correct formatting, based on the article, report, or book classes. Include the minimum number of up-to-date packages in the class, and add the nag package so that you (and other users) are warned if deprecated packages or commands creep in. (A minimal sketch is shown after this list.)
2. Create a template showing how to use the class file.
3. Create an SVN (or git, or whatever) repository for the class and template files, and distribute the repository URL to the LaTeX users.
4. Create documents using the lab-standard class file.
5. Convert the .tex files to .docx using Pandoc, which works on Windows, Mac, and Linux. (An example command is shown after this list.)
6. Get edits and peer reviews done on the .docx.
7. Transfer the edits from the .doc or .docx document back into LaTeX manually, and complete the PDF production in LaTeX.
8. Tag the document in Adobe Acrobat for Section 508 compliance (accessibility).
N.B. Using one of the web-based editors such as sharelatex.com or overleaf.com can remove the need for steps 5-7, especially now that they have rather good review tools.
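A minimal sketch of what such a class file might look like (the name labreport.cls and the particular packages are just placeholders; adapt them to your house style):
% labreport.cls -- hypothetical lab-standard class built on article
\NeedsTeXFormat{LaTeX2e}
\ProvidesClass{labreport}[2017/12/03 lab-standard report class (sketch)]
% load nag early so it can warn about deprecated packages and commands
\RequirePackage[l2tabu,orthodox]{nag}
% pass any unknown options through to the underlying article class
\DeclareOption*{\PassOptionsToClass{\CurrentOption}{article}}
\ProcessOptions\relax
\LoadClass[11pt,a4paper]{article}
% keep the package list short and current
\RequirePackage{graphicx}
\RequirePackage[margin=25mm]{geometry}
% corporate-design details (fonts, colours, headers, logo) would go here
The conversion in step 5 is then a one-liner along these lines (file names are placeholders):
pandoc report.tex -o report.docx
Recent versions of Pandoc also accept a --reference-doc option pointing at a .docx whose styles the output should inherit, which helps keep the Word version close to the corporate template.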
Challenges
There were a few challenges we had to overcome to get this adopted.
- Getting the editors and reviewers something that fit with their existing process, hence the use of the .docx format
- Figuring out how to get the same class file(s) to all users, hence the SVN repository
- Making sure people know how to use it, hence the template
- Figuring out tools that let people collaborate. But that's a whole other post!
508 Compliance / Structured PDFs
The one thing that is still causing trouble is 508 compliance. I have been working (slowly) on using the pdfcomment package to add tooltips, and on modifying the accessibility package so that documents are accessible. My test PDF documents sometimes pass automated testing in Adobe Acrobat...
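For the tooltip part, a minimal sketch of the idea using pdfcomment (the image file name and alt text are just placeholders):
\documentclass{article}
\usepackage{graphicx}
\usepackage{pdfcomment}
\begin{document}
% wrap the image in a tooltip so viewers that support it can show the description
\pdftooltip{\includegraphics[width=0.5\textwidth]{setup-diagram}}{Block diagram of the measurement setup}
\end{document}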
Repository
I've put a set of demo documents in a GitHub repository, which may be helpful.
Note re. Pandoc
3 Dec 2017: Originally I suggested latex2rtf instead of Pandoc. I have edited this answer to suggest Pandoc because it is kept up to date, works well, and offers the flexibility of many more input and output file types.
Here is an example where I shamelessly copied some R code from Cross-Validated. It can be compiled in many ways, but personally I used
R CMD Sweave 1.Rnw
pdflatex 1.tex
where 1.Rnw actually reads:
\documentclass[a4paper,11pt]{article}
\title{A sample Sweave demo}
\author{Author name}
\date{}
\begin{document}
\SweaveOpts{engine=R,eps=FALSE,pdf=TRUE,strip.white=all}
\SweaveOpts{prefix=TRUE,prefix.string=fig-,include=TRUE}
\setkeys{Gin}{width=0.6\textwidth}
\maketitle
<<echo=false>>=
set.seed(101)
library(ggplot2)
library(ellipse)
library(lattice)  # needed for xyTable() below
@
<<>>=
n <- 1000
x <- rnorm(n, mean=2)
y <- 1.5 + 0.4*x + rnorm(n)
df <- data.frame(x=x, y=y)
# take a bootstrap sample
df <- df[sample(nrow(df), nrow(df), rep=TRUE),]
xc <- with(df, xyTable(x, y))
df2 <- cbind.data.frame(x=xc$x, y=xc$y, n=xc$number)
df.ell <- as.data.frame(with(df, ellipse(cor(x, y),
scale=c(sd(x),sd(y)),
centre=c(mean(x),mean(y)))))
p1 <- ggplot(data=df2, aes(x=x, y=y)) +
geom_point(aes(size=n), alpha=.6) +
stat_smooth(data=df, method="loess", se=FALSE, color="green") +
stat_smooth(data=df, method="lm") +
geom_path(data=df.ell, colour="green", size=1.2)
@
\begin{figure}
\centering
<<fig=true,echo=false>>=
print(p1)
@
\caption{Here goes the caption.}
\label{fig:p1}
\end{figure}
\end{document}
With Beamer, you just have to replace the first line with
\documentclass[t,ucs,12pt,xcolor=dvipsnames]{beamer}
or add whatever customizations you want, replace \maketitle with something like \frame{\titlepage}, and then enclose every code chunk in a \begin{frame}[fragile] ... \end{frame} pair. Compilation works the same way as above.
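For example, a single frame wrapping one of the chunks above might look like this (just a sketch; the frame title is arbitrary):
\begin{frame}[fragile]{Bootstrap scatterplot}
<<fig=true,echo=false>>=
print(p1)
@
\end{frame}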
Code chunks can be customized using, e.g.
\DefineVerbatimEnvironment{Sinput}{Verbatim}
{formatcom = {\color{Sinput}},fontsize=\scriptsize}
\DefineVerbatimEnvironment{Soutput}{Verbatim}
{formatcom = {\color{Soutput}},fontsize=\footnotesize}
\DefineVerbatimEnvironment{Scode}{Verbatim}
{formatcom = {\color{Scode}},fontsize=\small}
It requires fancyvrb and needs to go somewhere after \begin{document}. Personally, I keep this in an external configuration file, which also contains, among other things,
\definecolor{Sinput}{rgb}{0.75,0.19,0.19}
\definecolor{Soutput}{rgb}{0,0,0}
\definecolor{Scode}{rgb}{0.75,0.19,0.19}
Best Answer
If it gets to the point where a PDF is bigger than a PNG at a high enough resolution for your end use, then make a high-resolution PNG file instead. For example, for a figure that will be 4 inches square on a printed page, you want 300 dpi x 4 = 1200 pixels.
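For instance, if the figure comes out of R (an assumption on my part; any plotting tool that lets you set the pixel size and resolution will do, and the file name is a placeholder):
png("figure.png", width = 1200, height = 1200, res = 300)  # 4in x 4in at 300 dpi
plot(x, y)  # whatever call actually draws your figure
dev.off()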
The cutoff point will depend on how much the PNG compression algorithm can achieve and how much overlapping "ink" you've got: all the ink in a PDF takes space, but you can draw a million points at the same location in a PNG and it will only add a few bytes.
There are also assorted PDF compression tools, but I'm guessing you're on Windows, and I use Unix command-line tools. While I'm on the subject of command-line tools, you can probably convert existing PDFs to raster PNGs with the convert tool from the ImageMagick suite.
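Something along these lines should work, assuming ImageMagick (and Ghostscript, which it uses for PDF input) is installed; file names are placeholders:
convert -density 300 figure.pdf figure.png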
But 600k isn't that big. I'm working on a book with 2 megabyte PDFs on almost every page.