Using open source software, I want to create one shapefile for each of the 72,000 census tracts in the United States, with each shapefile containing the census blocks in that tract.
I will probably start with a state shapefile that includes all of the blocks for that state and merge it with a GDB file. This will give me a list of blocks that are in the census tract.
But then I'm not sure how to split the state shapefile into smaller ones.
How can I do this?
Is there a way to automate this using open source software?
If so, what tools should I use?
I'm learning QGIS.
Best Answer
Basically, you can do the extract with ogr2ogr as long as you give it the census tract ID, so the real issue is generating 72,000 ogr2ogr calls.
Notes:
Use the -where switch to subset the data. Remember that an attribute query is much faster than a spatial query. Your question was a little vague, so I don't know whether you intend to join the blocks to the tracts by attribute or spatially, but I highly recommend the former.
So how do you build 72,000 ogr2ogr calls programmatically? You can use any tool you want, but here's an example with R:
You could also collect the system calls in one step and then iterate over them to run the ogr2ogr calls in a separate loop or at a later time, or write them to a bash script that you run from the command line.
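As a rough illustration of collecting the calls first and writing them to a bash script, here is a minimal Python sketch. The source file name `state_blocks.shp`, the output directory `tracts`, and the attribute field `TRACT_ID` are all hypothetical placeholders, so substitute whatever your data actually uses. It only builds the command strings and does not run ogr2ogr itself.

```python
import shlex


def build_ogr2ogr_commands(tract_ids, src="state_blocks.shp",
                           out_dir="tracts", field="TRACT_ID"):
    """Return one ogr2ogr argument list per tract ID.

    src, out_dir and field are hypothetical names used for illustration.
    """
    commands = []
    for tract_id in tract_ids:
        dst = f"{out_dir}/tract_{tract_id}.shp"
        # Attribute filter: keep only the blocks that belong to this tract.
        where = f"{field} = '{tract_id}'"
        # ogr2ogr expects the destination first, then the source.
        commands.append(
            ["ogr2ogr", "-f", "ESRI Shapefile", "-where", where, dst, src]
        )
    return commands


def to_bash_script(commands, out_dir="tracts"):
    """Render the commands as a bash script, quoting every argument."""
    lines = ["#!/bin/bash", "set -e", "mkdir -p " + shlex.quote(out_dir)]
    lines += [shlex.join(cmd) for cmd in commands]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    cmds = build_ogr2ogr_commands(["01001020100", "01001020200"])
    print(to_bash_script(cmds), end="")
```

Keeping each command as an argument list means the same list could also be passed straight to `subprocess.run` if you would rather execute the calls from Python than from a script.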
The major disadvantage of this approach is that it scans the data source once for each ogr2ogr call, so importing the source data once, iterating over it, and writing a shapefile for each row would probably be more efficient. But I would recommend trying that in something other than R, which is somewhat slow at reading large spatial datasets. (I ran an import of just the counties of the US, about 3,000 of them, and cancelled it after 15 minutes.)
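The single-pass alternative can be sketched as follows: read the block features once and bucket them by tract, then write each bucket out. This sketch assumes the features arrive as GeoJSON-style dictionaries and that each block carries a 15-digit GEOID in a field hypothetically named `GEOID` (the real field name depends on the vintage of your data). It relies on the fact that a block GEOID is state (2 digits) + county (3) + tract (6) + block (4), so the first 11 digits identify the tract. Writing the shapefiles themselves is left out, since that depends on the library you choose.

```python
from collections import defaultdict


def tract_id_from_block_geoid(block_geoid):
    """Return the 11-digit tract GEOID embedded in a 15-digit block GEOID."""
    if len(block_geoid) != 15 or not block_geoid.isdigit():
        raise ValueError(f"not a 15-digit block GEOID: {block_geoid!r}")
    return block_geoid[:11]


def group_blocks_by_tract(features, geoid_field="GEOID"):
    """Bucket block features by tract in a single pass over the data.

    geoid_field is a hypothetical attribute name; adjust it to your data.
    """
    groups = defaultdict(list)
    for feature in features:
        block_geoid = feature["properties"][geoid_field]
        groups[tract_id_from_block_geoid(block_geoid)].append(feature)
    return dict(groups)


if __name__ == "__main__":
    # Tiny made-up sample; geometries omitted for brevity.
    sample = [
        {"properties": {"GEOID": "010010201001000"}, "geometry": None},
        {"properties": {"GEOID": "010010201001001"}, "geometry": None},
        {"properties": {"GEOID": "010010202001000"}, "geometry": None},
    ]
    for tract, blocks in group_blocks_by_tract(sample).items():
        print(tract, len(blocks))
```

Each resulting bucket would then be written to its own shapefile, so the source data is scanned once rather than 72,000 times, at the cost of holding a state's blocks in memory.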