[GIS] extract a specific layer from a Geospatial PDF using GDAL/OGR

convert, esri-production-mapping, gdal, geospatial-pdf, ogr

I have GDAL/OGR 1.10 installed on a 64-bit Windows 7 machine. I have some Geospatial PDFs that I want to crop to the outline of the map and convert to rasters. However, clipping to the neatline does not give me what I need: the PDFs were created with Esri Production Mapping, so the neatline is offset from the actual edge of the map, and clipping by it includes the axis labels and a lot of whitespace.

I can list the layers of the PDF with gdalinfo -mdd LAYERS, and I know which layer represents the actual edge of the map.

What I need to know is how to use GDAL or OGR to convert this layer to a shapefile.

Best Answer

Clipping to the neatline and selecting layers are two separate operations.

To crop to the neatline, it is easiest to write its outline to a CSV file with a WKT column, something like:

id,WKT
1,"POLYGON ((672254.457103024818935 5165931.756110833957791,662673.757448789430782 5165665.745274489745498,662287.066668052924797 5179592.87801924906671,671867.766322288312949 5179858.888855593279004,672254.457103024818935 5165931.756110833957791))"
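If the corner coordinates are already at hand, the cutline file can also be generated rather than typed. The following is only a minimal sketch using the Python standard library; the function names `ring_to_wkt` and `write_cutline_csv` and the rounded corner values are illustrative assumptions, not part of GDAL.

```python
import csv
import io


def ring_to_wkt(corners):
    """Return a WKT POLYGON string for a sequence of (x, y) corners."""
    ring = list(corners)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a WKT ring must end where it starts
    coords = ",".join(f"{x!r} {y!r}" for x, y in ring)
    return f"POLYGON (({coords}))"


def write_cutline_csv(corners, stream):
    """Write an id,WKT table with a single polygon row to an open text stream."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["id", "WKT"])
    # The WKT contains commas, so csv.writer quotes the field automatically.
    writer.writerow([1, ring_to_wkt(corners)])


# Usage: placeholder corners (rounded, hypothetical values in map units).
corners = [
    (672254.46, 5165931.76),
    (662673.76, 5165665.75),
    (662287.07, 5179592.88),
    (671867.77, 5179858.89),
]
buffer = io.StringIO()
write_cutline_csv(corners, buffer)
print(buffer.getvalue())
```

To produce the actual file, pass a handle opened with `open("cutline.csv", "w", newline="")` instead of the in-memory buffer.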

Then you can tell gdalwarp to crop to that cutline:

gdalwarp in.pdf out.pdf -crop_to_cutline -cutline cutline.csv

To include only a specific layer, you can use the GDAL_PDF_LAYERS config option:

gdalwarp in.pdf out.pdf --config GDAL_PDF_LAYERS "Map_Frame.Transportation"

Or combine the two and convert to TIFF:

gdalwarp -co "TILED=YES" -co "TFW=YES" in.pdf out.tif -overwrite -crop_to_cutline -cutline cutline.csv --config GDAL_PDF_LAYERS "Map_Frame.Transportation" --config GDAL_PDF_DPI 300
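When several PDFs need the same treatment, the gdalwarp options used in this answer can be assembled from a script. This is a rough sketch, assuming gdalwarp is on the PATH; `build_gdalwarp_args` and `run_gdalwarp` are hypothetical helper names, and only flags and config options already shown above are used.

```python
import subprocess


def build_gdalwarp_args(src, dst, cutline=None, layers=None, dpi=None,
                        creation_options=()):
    """Build a gdalwarp argument list from the options used in this answer."""
    args = ["gdalwarp", "-overwrite"]
    if layers:
        # Restrict PDF rendering to the named layer(s).
        args += ["--config", "GDAL_PDF_LAYERS", layers]
    if dpi:
        # Resolution at which the PDF page is rasterized.
        args += ["--config", "GDAL_PDF_DPI", str(dpi)]
    if cutline:
        # Clip to the polygon in the cutline file and shrink the extent to it.
        args += ["-cutline", cutline, "-crop_to_cutline"]
    for option in creation_options:
        args += ["-co", option]
    args += [src, dst]
    return args


def run_gdalwarp(args):
    """Run gdalwarp; raises CalledProcessError if it exits with an error."""
    subprocess.run(args, check=True)


# Usage: print the command for one file; call run_gdalwarp(args) to execute it.
args = build_gdalwarp_args(
    "in.pdf", "out.tif",
    cutline="cutline.csv",
    layers="Map_Frame.Transportation",
    dpi=300,
    creation_options=["TILED=YES", "TFW=YES"],
)
print(" ".join(args))
```

Passing the arguments as a list avoids shell quoting problems with layer names, which matters on Windows where quoting rules differ from Unix shells.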

The third part of your question, converting the layer to a shapefile, cannot be done with gdalwarp: it deals exclusively with rasters, and shapefiles hold vector data. As far as I know, GDAL's raster tools offer no way to do it.
