I'm trying to replicate a study on gentrification of neighbourhoods and I'm dealing with missing values of mean rent and selling prices for some neighbourhoods. Since these values are very strongly spatially autocorrelated, I thought of using their lagged value (the average of the neighbours' values) as a proxy. The problem is that if any of the neighbours has an NA value, the function I'm using returns NA as the lagged value. Here's the data I'm using.
And here is the code.
library(spdep)
library(sf)
nh <- st_read("path/neighbourhoods.geojson")
# generate neighbour and weight lists for each unit
nb <- poly2nb(nh)
lw <- nb2listw(nb)
# generate lagged values for the mean rent price in 2014
# (NAOK = TRUE passes NAs through instead of raising an error; lag.listw has no na.action argument)
lag.listw(x = lw, var = nh$mean_renting_price2014, NAOK = TRUE, zero.policy = TRUE)
The result I'm getting is the following.
[1] NA 642.2900 689.4438 713.5050 813.3567 782.0900 746.1675 744.7970 859.8880 704.0800 755.5360 953.4275 1091.8675 1066.2850
[15] 1323.8140 NA 591.3333 NA 600.6350 640.9443 660.9120 621.4620 719.3700 698.3483 980.4478 950.5900 1023.2114 910.0057
[29] 842.2375 665.2929 590.2100 613.8743 802.3717 727.7780 631.9620 597.7475 554.8180 623.5500 565.3150 NA 721.8014 NA
[43] 853.0167 578.7550 NA NA 567.7075 NA 599.3863 679.8914 569.1567 538.3950 500.7200 NA NA NA
[57] 490.8133 615.4525 NA NA NA NA NA NA NA 592.3787 608.3825 614.7420 643.2975 851.1700
[71] 584.1100 603.8033 777.3600
Best Answer
Computing some quantity over each region's neighbours can be done using sapply on the neighbours list. Taking the plain mean of the neighbours' values is the same as your calculation, but passing na.rm = TRUE to mean drops the NAs from each average.
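The code that originally accompanied this step is not shown here, so the following is only a minimal sketch of the idea. The small nb list and x vector are hypothetical stand-ins; with the question's data they would be poly2nb(nh) and nh$mean_renting_price2014.

```r
# Toy stand-ins (hypothetical): each element of nb holds the indices of
# that region's neighbours, which is how an spdep neighbours list is laid out.
nb <- list(c(2L, 3L), c(1L, 3L), c(1L, 2L, 4L), c(3L, 5L), 4L)
x  <- c(600, NA, 700, NA, 800)

# Plain mean of each region's neighbours: NA whenever any neighbour is NA
lag_plain <- sapply(nb, function(ids) mean(x[ids]))

# Same calculation, but dropping NA neighbours before averaging
lag_narm <- sapply(nb, function(ids) mean(x[ids], na.rm = TRUE))

print(lag_plain)
print(lag_narm)
```

In this toy example the last region's only neighbour is NA, so its na.rm mean is taken over an empty set and comes out as NaN.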
That leaves one NaN, which I bet is a region with no non-NA neighbours. It's item 63. What are its neighbours, and what are their values?
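One way to check this is to look up the offending region in the neighbours list and then index the values with those neighbour ids. This is a sketch on hypothetical toy data (nb and x below are made up); with the question's data the region of interest would be item 63.

```r
# Toy stand-ins (hypothetical), not the question's data
nb <- list(c(2L, 3L), c(1L, 3L), c(1L, 2L, 4L), c(3L, 5L), 4L)
x  <- c(600, NA, 700, NA, 800)

lag_narm <- sapply(nb, function(ids) mean(x[ids], na.rm = TRUE))

# Regions whose NA-dropping mean is NaN, i.e. with no non-NA neighbours
bad <- which(is.nan(lag_narm))

# Their neighbour indices, and the values held by those neighbours
bad_neighbours <- nb[bad]
bad_values <- lapply(bad_neighbours, function(ids) x[ids])

print(bad)
print(bad_neighbours)
print(bad_values)
```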
You could fill the initial NA values from this first calculation, and then repeat it to get values for the "doubly NA-neighboured" regions.
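A rough sketch of that repeated fill is given below. The helper fill_from_neighbours and its max_iter argument are hypothetical names introduced for illustration, and the toy nb and x are made up. Each pass fills only the NA regions that currently have at least one non-NA neighbour, and it stops once nothing is missing or no further progress is possible.

```r
# Hypothetical helper: repeatedly replace NAs by the mean of non-NA neighbours
fill_from_neighbours <- function(x, nb, max_iter = 10) {
  for (i in seq_len(max_iter)) {
    missing <- which(is.na(x))
    if (length(missing) == 0) break
    # Estimates use the values as they stood at the start of this pass
    est <- sapply(nb[missing], function(ids) mean(x[ids], na.rm = TRUE))
    ok <- !is.nan(est)          # NaN means no non-NA neighbour yet
    if (!any(ok)) break         # nothing can be filled, so give up
    x[missing[ok]] <- est[ok]
  }
  x
}

# Toy chain of five regions (hypothetical): region 3 has only NA neighbours
nb <- list(2L, c(1L, 3L), c(2L, 4L), c(3L, 5L), 4L)
x  <- c(500, NA, NA, NA, 800)

filled <- fill_from_neighbours(x, nb)
print(filled)
```

Note that values filled in later passes are averages of earlier estimates rather than of observed prices, so they are progressively smoother and less reliable.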