I can't seem to find the answer to this anywhere. How can I read in a CSV file whose geometry column contains long/lat values and convert it to an sf object? Here is the dput for the file:
structure(list(date = c("2017-08-04", "2017-08-04", "2017-08-04", "2017-08-04", "2017-08-04", "2017-08-04"),
is_boarded = c("0", "0", "0", "0", "1", "0"),
fire = c("0", "0", "0", "0", "0", "0"),
homeless = c("0", "0", "1", "1", "0", "0"),
address = c("1231 N harding ave", "5942 S peoria st", "6440 S seeley ave", "6428 S paulina st", "9015 S houston ave", "10917 S buffalo ave"),
zip_code = c("60651", "60621", "60636", "60636", "60617", "60617"),
ward = c("37", "16", "16", "15", "10", "10"),
community_area = c("23", "68", "67", "67", "46", "52"),
geometry = c("c(-87.7251002085875, 41.903236038454)", "c(-87.6473828702868, 41.7862165473861)", "c(-87.6750561873273, 41.7767719172303)", "c(-87.6666233031588, 41.7770234233244)", "c(-87.5499059450373, 41.731640678147)", "c(-87.5437832254962, 41.6970145984798)"),
PRI_NEIGH = c("Humboldt Park", "Englewood", "Englewood", "Englewood", "South Chicago", "East Side")
),
row.names = c(NA, 6L),
class = "data.frame"
)
Best Answer
Here's one way to approach it:
The `gsub` is removing any matching parentheses. We use `\\` to escape them in the regex code. The `|` indicates we want to match `(`, `)` and the `c` character. Then use `separate` to separate the remaining values in the geometry column by a comma into two new columns. `st_as_sf` will take your lat and lon coordinates that we created in the previous step and convert to an `sf` object. I guessed on the crs.
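The steps above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code intended: the names `df` and `pts` are hypothetical, only two rows of the question's data are reproduced for brevity, the column names `lon` and `lat` are illustrative, and EPSG:4326 (WGS84) is an assumed CRS, since the question does not state one.

```r
library(dplyr)
library(tidyr)
library(sf)

# Assumption: `df` stands in for the data frame from the question
# (only the columns and rows needed for the illustration are kept).
df <- data.frame(
  address  = c("1231 N harding ave", "5942 S peoria st"),
  geometry = c("c(-87.7251002085875, 41.903236038454)",
               "c(-87.6473828702868, 41.7862165473861)"),
  stringsAsFactors = FALSE
)

pts <- df %>%
  # Remove "(", ")" and the leading "c" so only "lon, lat" remains
  mutate(geometry = gsub("\\(|\\)|c", "", geometry)) %>%
  # Split at the comma (plus any following spaces) into two numeric columns
  separate(geometry, into = c("lon", "lat"), sep = ",\\s*", convert = TRUE) %>%
  # Build point geometries; the CRS is a guess (WGS84 long/lat)
  st_as_sf(coords = c("lon", "lat"), crs = 4326)

print(pts)
```

Note that the first value in each pair is the longitude, so `coords` lists `lon` before `lat` (x before y). If the data actually live in a CSV file, reading it first with `read.csv()` and then applying the same pipeline should work the same way.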