Seems overcomplicated; why not use gIntersection with gSimplify from rgeos?
EDIT: nope, that drops the attribute data frames. Use raster::intersect instead.
Example:
library(sp)
library(rgdal)
library(rgeos)  # for gSimplify
library(raster)
dir <- 'C:/Users/obrl_soil/Downloads'
tz_world <- readOGR(paste0(dir, '/tz_world/world'),
                    'tz_world',
                    stringsAsFactors = FALSE)
# from http://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-populated-places/
cities <- readOGR(paste0(dir, '/ne_10m_populated_places'),
                  'ne_10m_populated_places',
                  stringsAsFactors = FALSE)
# only need the city name for this, so prune the attributes
cities@data <- cities@data[c('NAME')]
# simplify the geometry of the polygons to cut processing time
tz_world_simple <- gSimplify(tz_world, tol = 0.1, topologyPreserve = TRUE)
# reattach the data frame
tz_world_simple <- SpatialPolygonsDataFrame(tz_world_simple, data = tz_world@data)
st <- proc.time()
city_timezones <- intersect(cities, tz_world_simple)
proc.time() - st
# results
View(city_timezones@data)
It's still slow on my machine, but I only have 4GB of RAM to play with. A subset of ~600 cities took around 4 seconds, but the full cities dataset crashed the operation.
Update 2017
Since this got bumped, forget the above; here's a solution in sf:
library(sf)
library(tidyverse)
setwd('C:/DATA')
tzw <- read_sf(file.path(getwd(), 'world', 'tz_world.shp')) # 27742 obs
places <- read_sf(file.path(getwd(), 'ne_10m_populated_places_simple.shp')) # 7322 obs
# they're in the same crs, so
where_at <- st_intersection(select(places, name), tzw)
The full intersection runs in under 3 minutes on my machine. 123 points are lost because they do not intersect any tzw polygon (e.g. Seattle); one could play around with buffers to pick up the rest.
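An alternative to buffering, as a sketch only: snap each unmatched point to its nearest timezone polygon with st_join. This assumes the tzw and places objects from the snippet above, and that your sf version is recent enough to accept st_nearest_feature as a join predicate.

# Sketch, not the original answer's code: find the points that miss every
# polygon, then join each one to its nearest timezone polygon instead of
# buffering. Assumes 'places' and 'tzw' as read in above.
missed <- places[lengths(st_intersects(places, tzw)) == 0, ]
recovered <- st_join(missed, select(tzw, TZID), join = st_nearest_feature)

Note that nearest-feature joins over thousands of polygons can be slow; running it only on the handful of missed points, as here, keeps it cheap.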
How to select a specific layer with ogr2ogr is documented at http://www.gdal.org/ogr2ogr.html, though not especially clearly:
src_datasource_name
[-lco NAME=VALUE] [-nln name]
[-nlt type|PROMOTE_TO_MULTI|CONVERT_TO_LINEAR|CONVERT_TO_CURVE]
[-dim XY|XYZ|XYM|XYZM|2|3|layer_dim] [layer [layer ...]]
The layer name(s) are given after the name of the datasource, separated by spaces.
This should work for you:
ogr2ogr -f 'GPKG' -clipsrc 1077228.72991008 4218298.82092178 1211486.08907174 4469736.19302021 output.gpkg input_gpkg Towns
Alternatively you can select whatever you want with SQL:
ogr2ogr -f 'GPKG' -clipsrc 1077228.72991008 4218298.82092178 1211486.08907174 4469736.19302021 -sql "select * from Towns" output.gpkg input_gpkg
Best Answer
To do this in R, construct a URL string with your bounding box encoded in it; use paste0 or some other string concatenation function to build it. This is based on @user30184's answer:

Then you can use st_read from the sf package:

Getting about 11,000 buildings:

Your sf needs to be linked against a GDAL/OGR build that supports these servers:

If that returns NA values, then your GDAL/OGR needs upgrading/rebuilding.
Here's an R function that downloads the buildings given some bounding coordinates in WGS84:

giving:
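The steps above can be sketched as follows. This is only an illustration: the endpoint ('base_url'), the layer name, and the query parameter names are assumptions standing in for the actual server used in the answer, not real values from it.

# Hypothetical sketch of the approach described above. 'base_url' and the
# WFS parameter names are placeholders; substitute the real server URL.
library(sf)

get_buildings <- function(xmin, ymin, xmax, ymax,
                          base_url = 'https://example.com/wfs') {
  # encode the WGS84 bounding box into the request URL
  url <- paste0(base_url,
                '?service=WFS&version=2.0.0&request=GetFeature',
                '&typeNames=buildings',
                '&bbox=', paste(xmin, ymin, xmax, ymax, sep = ','))
  # read the response straight into an sf object; this only works if your
  # sf is linked against a GDAL/OGR that supports the relevant driver
  st_read(url)
}

You can check what your GDAL build supports with st_drivers(); if the driver you need is missing, that is the "upgrading/rebuilding" situation mentioned above.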