[TeX/LaTeX] Using includegraphics makes pdf very large

graphics pdf

I am using \includegraphics in pdflatex to extract and display single pages from a pdf file with 30 pages. Each page is a graph and the pdf contains up to 30 graphs. The resulting pdf is enormous (over 9MB) even though I only have 10 or so figures (plots using Python's matplotlib).

This is what I do within a figure environment to display a figure. 17 is the page number of the graph within AllGraphs.pdf

\includegraphics[page=17,scale=0.5]{AllGraphs.pdf}

I have the impression that pdflatex is embedding the entire pdf file of graphs in my document each time I include a single graph.

If so, is there any way to stop this, so that only the page actually displayed is embedded?

Update: I have just done an analysis. Each time I add a call to includegraphics, the size of the resulting file increases by roughly a third to a half of the size of the entire pdf from which I am extracting the single page (there are 36 pages in each pdf file). Without figures my pdf is about 500k. With 23 images it is over 9MB. That makes about 370k per image. Each file with 36 images is about 700-900k. I was expecting 900/36, i.e. about 25k per image. Another theory is that my inclusion of the pdf is doing something to the resolution of the included image.
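The arithmetic in this update can be checked with a short sketch. It assumes "over 9MB" means roughly 9000k and takes the upper end (900k) of the quoted source-file size; the function names are made up for illustration.

```python
def growth_per_figure_kb(size_with_kb, size_without_kb, n_figures):
    """Average growth of the output PDF per included figure, in kB."""
    return (size_with_kb - size_without_kb) / n_figures


def expected_page_kb(source_kb, n_pages):
    """Share of the source PDF one page would take if all pages were equal."""
    return source_kb / n_pages


# Figures quoted in the question (9 MB is taken as 9000 kB, an assumption).
observed_kb = growth_per_figure_kb(9000, 500, 23)
expected_kb = expected_page_kb(900, 36)

print(f"observed growth per figure: {observed_kb:.0f} kB")
print(f"expected share per page:    {expected_kb:.0f} kB")
print(f"observed / expected:        {observed_kb / expected_kb:.1f}x")
```

Under these assumptions each included figure costs well over ten times the share a single page would be expected to take, and roughly 40% of the whole 900k source file.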

Update 2: The pdf with the initial figures (36 to a file) is produced using PdfPages in Python. I call the following code for each graph. pp is the PdfPages object created by calling

pp= PdfPages(fileName)

Then for each figure I have

fig = plt.figure(figsize=(8.27, 11.69), dpi=100)
... plotting lines ...
plt.tight_layout()
plt.legend(loc='best')
plt.draw()

pp.savefig(fig)

The file appears to be a perfectly ordinary pdf with 36 pages and a graph on each.
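One way to test the theory that the whole multi-page file is being embedded would be to write each graph to its own single-page pdf and include those files instead. The following is only a sketch under that assumption: the helper names, the "graph" file stem and the output directory are hypothetical, and whether this actually shrinks the final document has not been verified here. It relies only on each figure object having a savefig(path) method, which matplotlib figures do.

```python
import os


def page_filename(index, stem="graph"):
    """Build a file name such as graph_17.pdf (the naming scheme is an assumption)."""
    return f"{stem}_{index:02d}.pdf"


def save_figures_separately(figures, out_dir, stem="graph"):
    """Save every figure to its own single-page PDF and return the paths.

    `figures` is any iterable of objects with a savefig(path) method,
    for example matplotlib Figure instances.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, fig in enumerate(figures, start=1):
        path = os.path.join(out_dir, page_filename(index, stem))
        fig.savefig(path)  # matplotlib picks the PDF format from the .pdf suffix
        paths.append(path)
    return paths
```

With files named this way, the figure that was page 17 of AllGraphs.pdf would be included as \includegraphics[scale=0.5]{graph_17.pdf}, with no page option.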

Best Answer

You have got:

  1. source PDF
  2. target PDF

The PDF format is a container, and I have no idea how you produce this source PDF. My guess is that the source PDF contains the graphs in a format which cannot be used directly by pdfLaTeX, but "fortunately" there is background software included in your TeX installation which produces a large jpg or eps file from your source PDF while compiling, and this jpg is included in your target PDF.

I'm an expert on neither PDF nor pdfLaTeX, but could you describe a bit how you produce the source PDF?

In any case, shrinking a large PDF after the fact is possible as well. There is a tool called "pdfsizeopt.py"; I have been using it for years, on Windows as well as on Linux.
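To see how much an optimizer of this kind actually saves, the sizes of the file before and after can be compared. This is a small sketch, not part of pdfsizeopt itself; the function name and the file names in the note below are hypothetical.

```python
import os


def size_report(original_path, optimized_path):
    """Return (bytes before, bytes after, percent saved) for two files."""
    before = os.path.getsize(original_path)
    after = os.path.getsize(optimized_path)
    # Guard against an empty original to avoid dividing by zero.
    saved_pct = 100.0 * (before - after) / before if before else 0.0
    return before, after, saved_pct
```

For example, size_report("thesis.pdf", "thesis.optimized.pdf") would return the two sizes in bytes and the percentage saved, assuming both files exist.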

To avoid trouble with embedded fonts in embedded figures, the first line of my *.tex file is:

\pdfinclusioncopyfonts=1