[GIS] Merge spatial and non-spatial data and create SpatialPolygonsDataFrame in R

Tags: r, raster, spdep

I am trying to merge non-spatial data (a data frame in R) with spatial data (a SpatialPolygonsDataFrame in R) and end up with the merged result as a SpatialPolygonsDataFrame.

Specifically, I downloaded the ZIP code area boundary file from the Census website (https://www.census.gov/geo/maps-data/data/cbf/cbf_zcta.html) and read it in as a SpatialPolygonsDataFrame using the readShapePoly function in R.

I then merged the SpatialPolygonsDataFrame with my data, which is a plain data frame, but the result is a data frame rather than a SpatialPolygonsDataFrame.

Can someone tell me how to get a SpatialPolygonsDataFrame after merging spatial and non-spatial data? I have spent the entire day on this without getting anywhere.

The code that I used is as follows:

library(maptools)  # provides readShapePoly
nation <- readShapePoly("C:/Users/Research/data/nation.shp")
us_urban_zipcode <- as.data.frame(us_urban_index13[c(39, 91)])
us_urban_index_nation <- merge(us_urban_zipcode, nation, by = "ZCTA5CE10",
                               all.x = TRUE)

nation is a SpatialPolygonsDataFrame and us_urban_zipcode is a data frame. Merging them yields a data frame, not the SpatialPolygonsDataFrame that I need for further analysis.

My non-spatial data looks like this:

---------------
zipcode | row
 10003  |  1
 10002  |  2
 10003  |  3
 10004  |  4
 10002  |  5
  ...   | ..
-----------------

And my spatial data looks like this:

----------------------
zipcode | AFFGEOID10
 10001  |  477175
 10002  |  2118827
 10003  |  78588518
 10004  |  9804480
 10005  |  4845242
  ...   |    ..
-----------------------

So my non-spatial data has more observations than my spatial data. Each ZIP code appears only once in the spatial data, but ZIP codes repeat in the non-spatial data.
I need to keep every observation in the non-spatial data for further analysis.
This is why I used 'all.x = T' or 'all.y = T' in the merge function.

Best Answer

I would recommend reading your shapefile in with rgdal::readOGR. If you run into performance issues, look into reading and merging spatial data with the sf package and its simple features workflow.

For this to work, I like to give the key columns identical names in both objects before merging. Alternatively, you can name the key column of each object separately with the by.x and by.y arguments of the merge function.
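
A minimal sketch of the by.x / by.y form, using toy data in place of the real shapefile. The three unit squares, the data frame column name zipcode, and the score values are made up for illustration; only the ZCTA5CE10 column name comes from the question.

```r
library(sp)

# Toy stand-in for the shapefile: three unit squares keyed by ZCTA5CE10.
# (Hypothetical geometry, for illustration only.)
zips <- c("10001", "10002", "10003")
squares <- lapply(seq_along(zips), function(i) {
  # clockwise ring, closed by repeating the first vertex
  ring <- cbind(c(i, i, i + 1, i + 1, i), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(ring)), ID = zips[i])
})
toy_spdf <- SpatialPolygonsDataFrame(
  SpatialPolygons(squares),
  data.frame(ZCTA5CE10 = zips, row.names = zips, stringsAsFactors = FALSE)
)

# Non-spatial table whose key column has a different name (assumed: "zipcode").
mydf <- data.frame(zipcode = c("10002", "10003"),
                   score   = c(0.8, 0.3),
                   stringsAsFactors = FALSE)

# Spatial object first, so that sp's merge method is the one dispatched.
merged <- merge(toy_spdf, mydf, by.x = "ZCTA5CE10", by.y = "zipcode")

class(merged)  # still a SpatialPolygonsDataFrame
merged@data    # 10001 has no match in mydf, so its score is NA
```

Argument order matters here: with the data frame first, as in the question, R dispatches the ordinary data frame merge and the geometry is lost.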

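library(rgdal)  # also attaches sp, which provides the merge method used below
mydf   <- read.csv("myCsv.csv")
myspdf <- readOGR("myShapefile.shp")

## then merge using sp's merge function (the spatial object must come first)
mynewspdf <- merge(myspdf, mydf)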

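You may get a "non-unique matches detected" error, which happens when one polygon matches more than one row of the data frame (as with the repeated ZIP codes in the question). In that case, try: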

mynewspdf <- merge(myspdf, mydf, duplicateGeoms = TRUE)
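
To make the effect of duplicateGeoms concrete, here is a small sketch with toy data shaped like the tables in the question. The four unit squares are hypothetical stand-ins for the ZIP code polygons, and the data frame reuses the question's repeated ZIP codes.

```r
library(sp)

# Toy stand-in for the shapefile: four unit squares keyed by ZCTA5CE10.
# (Hypothetical geometry, for illustration only.)
zips <- c("10001", "10002", "10003", "10004")
squares <- lapply(seq_along(zips), function(i) {
  # clockwise ring, closed by repeating the first vertex
  ring <- cbind(c(i, i, i + 1, i + 1, i), c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(ring)), ID = zips[i])
})
toy_spdf <- SpatialPolygonsDataFrame(
  SpatialPolygons(squares),
  data.frame(ZCTA5CE10 = zips, row.names = zips, stringsAsFactors = FALSE)
)

# Non-spatial data in which ZIP codes repeat, as in the question.
mydf <- data.frame(ZCTA5CE10 = c("10003", "10002", "10003", "10004", "10002"),
                   row = 1:5,
                   stringsAsFactors = FALSE)

# Without duplicateGeoms the one-to-many match is rejected with an error;
# capture the message instead of stopping.
err <- tryCatch(merge(toy_spdf, mydf, by = "ZCTA5CE10"),
                error = function(e) conditionMessage(e))

# With duplicateGeoms = TRUE each polygon is repeated once per matching row.
merged <- merge(toy_spdf, mydf, by = "ZCTA5CE10", duplicateGeoms = TRUE)

nrow(merged@data)        # one row per match, plus the unmatched polygon 10001
length(merged@polygons)  # same count, because geometries were duplicated
```

One caveat, as far as I can tell from sp's behaviour: rows of the data frame whose ZIP code has no polygon are not carried into the result, because a Spatial object cannot hold a row without geometry. If every observation must be kept, check for unmatched ZIP codes before merging.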

For more information, see https://www.rdocumentation.org/packages/sp/versions/1.2-5/topics/merge
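
For the sf route mentioned at the start of this answer, a rough sketch might look like the following. To my knowledge rgdal and maptools were retired from CRAN in 2023, so sf is likely the easier package to install today. The toy squares stand in for what st_read("myShapefile.shp") would return, and the data frame values are invented for illustration.

```r
library(sf)

# Toy stand-in for st_read("myShapefile.shp"): three unit squares.
# (Hypothetical geometry, for illustration only.)
zips <- c("10001", "10002", "10003")
squares <- lapply(seq_along(zips), function(i) {
  st_polygon(list(cbind(c(i, i, i + 1, i + 1, i), c(0, 1, 1, 0, 0))))
})
zcta_sf <- st_sf(ZCTA5CE10 = zips, geometry = st_sfc(squares))

# Non-spatial data with a repeated ZIP code.
mydf <- data.frame(ZCTA5CE10 = c("10003", "10002", "10003"),
                   row = 1:3,
                   stringsAsFactors = FALSE)

# merge() on an sf object keeps the geometry column; all.x = TRUE keeps
# polygons that have no matching row in mydf.
merged_sf <- merge(zcta_sf, mydf, by = "ZCTA5CE10", all.x = TRUE)

# Convert back if a later step needs an sp object (requires sp to be installed).
merged_sp <- as(merged_sf, "Spatial")
class(merged_sp)
```

Here too the spatial object goes first. Repeated ZIP codes simply produce repeated features, so no extra argument is needed for the one-to-many case.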
