I am wondering how to join spatial polygons using R code?
I'm working with census data where certain areas change over time, and I wish to join the affected polygons together with their corresponding data and simply report on the joined areas. I am maintaining a list of polygons that change from census to census and that I plan to merge. I'd like to use this list of area names as a lookup list to apply to census data from different years.
I'm wondering which R function to use to merge selected polygons and their respective data. I have googled it, but the results have only left me confused.
Best Answer
The following solution is based on a post by Roger Bivand on R-sig-Geo. I took his example and replaced the German shapefile with some census data from Oregon, which you can download from here (take all shapefile components from 'Oregon counties and census data').
Let's start with loading the required packages and importing the shapefile into R.
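The loading step might look roughly like the sketch below. It assumes the sp, rgdal and maptools packages are installed (rgdal and maptools have since been retired from CRAN, so they may need to come from the archive). The folder and layer names are hypothetical placeholders, not the names used in the download.

```r
library(sp)
library(rgdal)     # readOGR(); assumed to be installed
library(maptools)  # unionSpatialPolygons(), needed for the later steps

# Hypothetical folder and layer name: point these at the downloaded files
shp.dir <- "data/oregon"
shp.layer <- "orcounty"

# Only attempt the import when the shapefile is actually present
if (file.exists(file.path(shp.dir, paste0(shp.layer, ".shp")))) {
  oregon <- readOGR(dsn = shp.dir, layer = shp.layer)
  # Label point (roughly the centroid) of each county, one row per polygon
  oregon.coords <- coordinates(oregon)
}
```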
Next, you need some grouping variable in order to aggregate the data. In our example, grouping is simply based on each county's coordinates. See the image below: black borders indicate the original polygons, whereas red borders represent polygons aggregated by oregon.id.
So far, so good. However, data attributes related to the original shapefile's subregions (e.g. population density, area, etc.) get lost when performing unionSpatialPolygons. I guess you'd like to aggregate the census data associated with the shapefile as well, so you'll need an intermediate step.
You first have to convert your polygons to a data frame in order to perform the aggregation. Now let's take data attribute columns six to eight ("AREA", "POP1990", "POP1997") and aggregate them according to the above IDs, applying the function sum.
Finally, convert your data frame back to a SpatialPolygonsDataFrame, passing in the previously unioned polygons, oregon.union, and you obtain both the generalized polygons and your census data derived from the aggregation step above.
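To make the whole workflow concrete without the Oregon files, here is a minimal sketch on toy data. It assumes sp and maptools are installed. The four unit squares, their population figures and the "west"/"east" grouping are invented purely for illustration; only the column names are borrowed from the example above.

```r
library(sp)
library(maptools)  # unionSpatialPolygons(); assumed to be installed

# Toy stand-in for the county shapefile: four unit squares in a row
make_square <- function(x0, id) {
  # Clockwise ring, closed by repeating the first vertex
  xy <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(xy, hole = FALSE)), ID = id)
}
ids <- c("a", "b", "c", "d")
polys <- SpatialPolygons(mapply(make_square, 0:3, ids, SIMPLIFY = FALSE))
attrs <- data.frame(AREA = rep(1, 4),
                    POP1990 = c(10, 20, 30, 40),
                    POP1997 = c(11, 22, 33, 44),
                    row.names = ids)
counties <- SpatialPolygonsDataFrame(polys, attrs)

# Grouping variable: one ID per original polygon, based on its x-coordinate
group.id <- ifelse(coordinates(counties)[, 1] < 2, "west", "east")

# 1. Dissolve polygons that share an ID (attributes are dropped here)
counties.union <- unionSpatialPolygons(counties, group.id)

# 2. Sum the attribute columns per group
counties.df <- as(counties, "data.frame")
counties.agg <- aggregate(counties.df[, c("AREA", "POP1990", "POP1997")],
                          by = list(group = group.id), FUN = sum)
# Row names must match the polygon IDs of the unioned object
row.names(counties.agg) <- as.character(counties.agg$group)

# 3. Reattach the aggregated attributes to the merged polygons
counties.merged <- SpatialPolygonsDataFrame(counties.union, counties.agg)
```

For the census use case in the question, the grouping vector could instead come from the lookup list of area names, for example a named character vector indexed by each polygon's name column (both hypothetical here), as long as it holds exactly one ID per original polygon.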