Shapefile – Converting High Resolution Digital PDFs Into a Shapefile

convert, pdf, shapefile

I have to convert several PDF maps like the ones below into shapefiles. Is there any open-source software, or a quicker way, to do this?

One such file: http://ceodelhi.gov.in/ACMAP/5.pdf

Another such file: http://ceodelhi.gov.in/WriteReadData/CMS/Documents/DelhiGISmapwithPollingLocations.pdf

Best Answer

Pulling map data out of PDFs is rarely simple.

Sometimes the map is a raster image in the document and you can do no better than that.

Sometimes the map is vector data, but it's drawn with no geographic reference, the layers are not separated, and it is very hard to turn that back into proper map data.

Sometimes you get lucky and the map has been created from a GIS that has preserved some of the geographic layering and feature data.

Your two examples are in this last category.
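A quick way to check which category a PDF falls into, without opening QGIS, is to ask GDAL which vector layers it can read from the file. The following is only a sketch: it assumes the GDAL command-line tools are installed and were built with PDF read support (Poppler, PDFium or PoDoFo), and the local filename is a hypothetical downloaded copy of the second PDF.

```python
import shutil
import subprocess


def ogrinfo_command(pdf_path):
    """Build an ogrinfo call that summarises every vector layer in a PDF."""
    # -so: summary only (no feature dump); -al: report all layers
    return ["ogrinfo", "-so", "-al", pdf_path]


def summarise_pdf_layers(pdf_path):
    """Return ogrinfo's text report, or None if ogrinfo is not on PATH."""
    if shutil.which("ogrinfo") is None:
        return None
    result = subprocess.run(
        ogrinfo_command(pdf_path), capture_output=True, text=True
    )
    # ogrinfo writes its report to stdout and error messages to stderr
    return result.stdout or result.stderr


if __name__ == "__main__":
    # Hypothetical local copy of the second PDF from the question
    print(summarise_pdf_layers("DelhiGISmapwithPollingLocations.pdf"))
```

If the report lists named layers with a sensible extent and coordinate system, the PDF probably kept its GIS structure. If no vector layers are reported, the map is most likely just a raster image.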

So, for example, I can load the second one into QGIS and import the data as vector layers. It looks a mess at first, since quite a bit of the PDF is "design": boxes, the overview map, and so on. If I disable those layers I can see things that you may be interested in. For example, here's a map showing some polling locations as orange dots and purple polygons, with an OpenStreetMap background in grey, which tells us this is in the right place:

[Image: QGIS map of the imported PDF layers over a grey OpenStreetMap background]

All the colour on that map has come from your PDF - so there's a roads layer (which looks slightly shifted with respect to the OSM roads) and some big text labels. If all you want are the polling locations, you could save these to two shapefiles (one for the points, one for the squares which are polygons).

But note there's no feature data with these points - you'll get the coordinates and nothing else.
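If you have many PDFs to process, the same export can be scripted with GDAL's ogr2ogr instead of clicking through QGIS. This is a rough sketch rather than a tested recipe for these particular files: it assumes ogr2ogr is installed with PDF read support, and the layer name and filenames are hypothetical (use whatever names ogrinfo or QGIS report for your PDF).

```python
import os
import shutil
import subprocess


def export_command(pdf_path, layer, geometry_type, out_shp):
    """Build an ogr2ogr call that writes one geometry type to a shapefile."""
    return [
        "ogr2ogr",
        "-f", "ESRI Shapefile",
        # A shapefile holds one geometry type, so keep only matching features.
        # OGR_GEOMETRY is a special field of the OGR SQL dialect.
        "-where", f"OGR_GEOMETRY='{geometry_type}'",
        out_shp,    # destination comes first
        pdf_path,   # then the source
        layer,      # then the source layer to read
    ]


if __name__ == "__main__":
    pdf = "DelhiGISmapwithPollingLocations.pdf"  # hypothetical local copy
    layer = "Polling_Locations"                  # hypothetical layer name
    jobs = [("POINT", "polling_points.shp"), ("POLYGON", "polling_squares.shp")]
    for geometry_type, out_shp in jobs:
        cmd = export_command(pdf, layer, geometry_type, out_shp)
        if shutil.which("ogr2ogr") and os.path.exists(pdf):
            subprocess.run(cmd, check=True)
        else:
            # Nothing to run here, so just show the command
            print(" ".join(cmd))
```

Depending on how the PDF was written, some polygons may be reported as MULTIPOLYGON, in which case the filter value needs adjusting. The resulting shapefiles will still carry geometry only.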

Your first PDF seems to suffer a bit in translation, and there's no sign of the big polygon boundary, which might be the thing you are after from that file. I don't know why it's not imported, but that's what happens with PDF imports. Your best option is to ask the people who created the PDF for the source data.