Creating Lines Between Point Pairs in R – Spatial Analysis


I have an sf data frame, table_sf, with the following structure:

Classes ‘sf’, ‘tbl_df’, ‘tbl’ and 'data.frame': 12251 obs. of  5 variables:
 $ ID      : int  1 2 3 4 5 6 7 8 9 10 ...
 $ NOMBRE  : chr  "AL011900" "AL011900" "AL011900" "AL011900" ...
 $ FECHA   : POSIXct, format: "1900-08-27 00:00:00" "1900-08-27 06:00:00" "1900-08-27 12:00:00" "1900-08-27 18:00:00" ...
 $ INT     : num  18 18 18 18 18 ...
 $ geometry:sfc_POINT of length 12251; first list element:  'XY' num  -42.1 15
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA
  ..- attr(*, "names")= chr  "ID" "NOMBRE" "FECHA" "INT"

I would like to create lines between sequential points for each NOMBRE, and each line must carry the INT value of the first point used to create it.

That is: the line from point 1 to point 2 will have the INT value of point 1, the line from point 2 to point 3 will have the INT value of point 2, and so on for each NOMBRE.

My approach has been to loop through each pair of points within each NOMBRE and create the line using the st_cast function from the sf package, but I'm still far from making it work.

Here is the code I have so far (packages tidyverse & sf):

for (i in table_sf %>% group_by(NOMBRE) %>% summarise()) {
  table_huracanes <- table_sf %>% 
    filter(NOMBRE == i) %>% 
    mutate(numb = row_number())
  for (h in table_huracanes$numb) {
    linea <- table_huracanes %>% 
      filter(numb <= 2) %>% 
      group_by(NOMBRE) %>% 
      summarise() %>% 
      st_cast("LINESTRING")
  }
}

I think I'm still missing a couple of things:

  • Find a way to select only two points on each iteration of the
    second for loop.
  • Create an empty sf data frame (linestring or multilinestring?) to
    which each linea can be appended.

To clarify, the table before creating the sf dataframe looks like this:

NOMBRE    LAT   LONG   FECHA                 INT
AL011900  15    -42.1  1900-08-27T00:00:00Z  18.0054
AL011900  15.2  -43.4  1900-08-27T06:00:00Z  18.0054
AL011900  15.3  -44.7  1900-08-27T12:00:00Z  18.0054
AL011900  15.4  -45.6  1900-08-27T18:00:00Z  18.0054
AL021900  19    -59.3  1900-09-13T12:00:00Z  33.4386
AL021900  19.5  -60    1900-09-13T18:00:00Z  36.0108
AL021900  20    -60.6  1900-09-14T00:00:00Z  38.583
AL041905  36.3  -48.6  1905-10-10T18:00:00Z  46.2996
AL041905  37.9  -47.9  1905-10-11T00:00:00Z  43.7274
AL041905  39.6  -47.1  1905-10-11T06:00:00Z  41.1552
AL041905  41    -46    1905-10-11T12:00:00Z  41.1552

Note that INT can vary within a single NOMBRE. That's why I need a separate line for each pair of points rather than one line per NOMBRE.

Best Answer

Here is a tidyverse method, starting from your table as it was before conversion to sf. The approach is to build a long-form table in which each row is the start or end point of a segment, tagged with a lineid. Grouping by that lineid and calling summarise() unions each pair of points, and st_cast() then turns each pair into a LINESTRING.

library(tidyverse)
library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, proj.4 4.9.3
table <- structure(list(NOMBRE = c("AL011900", "AL011900", "AL011900", "AL011900", "AL021900", "AL021900", "AL021900", "AL041905", "AL041905", "AL041905", "AL041905"), LAT = c(15, 15.2, 15.3, 15.4, 19, 19.5, 20, 36.3, 37.9, 39.6, 41), LONG = c(-42.1, -43.4, -44.7, -45.6, -59.3, -60, -60.6, -48.6, -47.9, -47.1, -46), INT = c(18.0054, 18.0054, 18.0054, 18.0054, 33.4386, 36.0108, 38.583, 46.2996, 43.7274, 41.1552, 41.1552)), row.names = c(NA, -11L), class = c("tbl_df", "tbl", "data.frame"), spec = structure(list(cols = list(NOMBRE = structure(list(), class = c("collector_character", "collector")), LAT = structure(list(), class = c("collector_double", "collector")), LONG = structure(list(), class = c("collector_double", "collector")), FECHA = structure(list(format = ""), class = c("collector_datetime", "collector")), INT = structure(list(), class = c("collector_double", "collector"))), default = structure(list(), class = c("collector_guess", "collector"))), class = "col_spec"))

table_sf <- table %>%
  group_by(NOMBRE) %>%
  mutate(
    lineid = row_number(), # create a lineid
    LONG_end = lead(LONG), # create the end point coords for each start point
    LAT_end = lead(LAT)
  ) %>% 
  unite(start, LONG, LAT) %>% # collect coords into one column for reshaping
  unite(end, LONG_end, LAT_end) %>%
  filter(end != "NA_NA") %>% # remove nas (last points in a NOMBRE group don't start lines)
  gather(start_end, coords, start, end) %>% # reshape to long
  separate(coords, c("LONG", "LAT"), sep = "_") %>% # convert our text coordinates back to individual numeric columns
  mutate_at(vars(LONG, LAT), as.numeric) %>%
  st_as_sf(coords = c("LONG", "LAT")) %>% # create points
  group_by(NOMBRE, INT, lineid) %>%
  summarise() %>% # union points into lines using our created lineid
  st_cast("LINESTRING")

plot(table_sf[, 1:2])

You can see in the plot that each line between two points has its own INT as requested.
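
As a possible alternative to the text round trip through unite() and separate(), the segments can also be built directly from the start and end coordinates. The following is only a sketch, not part of the original answer: the names pts, segs, geoms and lines_sf are made up for illustration, the data are the rows from the question's table (without FECHA), and no CRS is set. Each segment is constructed explicitly from start point to end point, so the vertex order does not depend on how the points are unioned.

```r
library(dplyr)
library(sf)

# Rows copied from the question's table (FECHA omitted for brevity)
pts <- tibble(
  NOMBRE = c(rep("AL011900", 4), rep("AL021900", 3), rep("AL041905", 4)),
  LAT  = c(15, 15.2, 15.3, 15.4, 19, 19.5, 20, 36.3, 37.9, 39.6, 41),
  LONG = c(-42.1, -43.4, -44.7, -45.6, -59.3, -60, -60.6,
           -48.6, -47.9, -47.1, -46),
  INT  = c(18.0054, 18.0054, 18.0054, 18.0054, 33.4386, 36.0108, 38.583,
           46.2996, 43.7274, 41.1552, 41.1552)
)

# Pair every point with the next point of the same NOMBRE
segs <- pts %>%
  group_by(NOMBRE) %>%
  mutate(
    lineid   = row_number(),
    LONG_end = lead(LONG),
    LAT_end  = lead(LAT)
  ) %>%
  ungroup() %>%
  filter(!is.na(LONG_end)) # the last point of each NOMBRE starts no line

# One two-vertex LINESTRING per row, start point first
geoms <- mapply(
  function(x1, y1, x2, y2) st_linestring(rbind(c(x1, y1), c(x2, y2))),
  segs$LONG, segs$LAT, segs$LONG_end, segs$LAT_end,
  SIMPLIFY = FALSE
)

# Attach the geometries; INT on each row is the INT of the start point
lines_sf <- st_sf(
  select(segs, NOMBRE, lineid, INT),
  geometry = st_sfc(geoms)
)
```

Because INT is simply carried along on the row of the start point, it never has to be part of a grouping key. This sketch assumes the rows are already in chronological order within each NOMBRE; if they are not, sorting by FECHA before the mutate() step would be needed.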

[Plot: output of plot(table_sf[, 1:2]), with one panel each for NOMBRE and INT]
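
The answer starts from the plain table. If only the sf point object from the question is at hand, the coordinates can be pulled back into ordinary columns first. This is a minimal sketch under that assumption; pts_sf is a small hypothetical stand-in for the question's table_sf, and table_plain is an illustrative name.

```r
library(dplyr)
library(sf)

# Hypothetical stand-in for the question's sf point object
pts_sf <- st_as_sf(
  data.frame(
    NOMBRE = c("AL021900", "AL021900", "AL021900"),
    INT    = c(33.4386, 36.0108, 38.583),
    LONG   = c(-59.3, -60, -60.6),
    LAT    = c(19, 19.5, 20)
  ),
  coords = c("LONG", "LAT")
)

# For POINT geometries, st_coordinates() returns a matrix with columns X and Y
xy <- st_coordinates(pts_sf)

# Drop the geometry column and restore LONG / LAT as plain numeric columns
table_plain <- pts_sf %>%
  st_drop_geometry() %>%
  mutate(LONG = xy[, "X"], LAT = xy[, "Y"])
```

The resulting table_plain has the same NOMBRE, LONG, LAT and INT columns the answer's pipeline expects.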