I'm trying to cross-validate the results of an [auto]Krige
output but for some unknown reason the results of both autoKrige.cv
(automap package) and krige.cv
(gstat package) return NA for all values (e.g. var1.pred) other than the observed, yet, autoKrige
works fine generating prediction values so I'm struggling to work out what the issue is.
The code I'm running is based on this code for the example meuse dataset:
data(meuse)
coordinates(meuse) = ~x+y
proj4string(meuse) <- CRS("+init=epsg:28992")
meuse.krg <-autoKrige (zinc~1,meuse)
meuse.cv <- autoKrige.cv (zinc~1,meuse)
This works fine.
And now with my data (these data can be found here: https://www.dropbox.com/sh/4lu25u9eun02vu6/AABfmp0ciUMWEV03hKgIVbQ_a?dl=0)
#load data from Hollicombe S1 shapefile
hs1<- readOGR (".", "Hollicombe_S1_L1-5_A1.2")
#transforms to appropriate UTM projection
hs1 <- spTransform(hs1.data,"+proj=utm +ellps=WGS84 +datum=WGS84 +zone=30")
#autokrige working fine
hs1.krg <- autoKrige (MeanBioHei~1,hs1)
summary(hs1.krg)
krige_output:
Object of class SpatialPixelsDataFrame
Coordinates:
min max
x1 461034.8 461496.4
x2 5588235.5 5588991.2
Is projected: TRUE
proj4string :
[+proj=utm +ellps=WGS84 +datum=WGS84 +zone=30 +towgs84=0,0,0]
Number of points: 4998
Grid attributes:
cellcentre.offset cellsize cells.dim
x1 461034.8 4.084753 114
x2 5588235.5 4.084753 186
Data attributes:
var1.pred var1.var var1.stdev
Min. : -1.252 Min. :635.4 Min. :25.21
1st Qu.: 41.097 1st Qu.:662.6 1st Qu.:25.74
Median : 65.191 Median :676.6 Median :26.01
Mean : 59.831 Mean :682.6 Mean :26.12
3rd Qu.: 81.756 3rd Qu.:689.3 3rd Qu.:26.25
Max. :102.207 Max. :985.3 Max. :31.39
#but then when trying to use autoKrige.cv
hs1.cv <- autoKrige.cv (MeanBioHei~1,hs1)
print(hs1.cv)
$krige.cv_output
coordinates var1.pred var1.var observed residual zscore fold
1 (461031.4, 5588291) NA NA 0.182 NA NA 1
2 (461043.2, 5588294) NA NA 0.197 NA NA 2
3 (461043.3, 5588309) NA NA 0.222 NA NA 3
4 (461055.2, 5588309) NA NA 0.163 NA NA 4
5 (461055.3, 5588328) NA NA 0.240 NA NA 5
6 (461055.3, 5588328) NA NA 0.196 NA NA 6
7 (461067.1, 5588328) NA NA 0.204 NA NA 7
8 (461067.1, 5588328) NA NA 0.228 NA NA 8
9 (461067.3, 5588346) NA NA 0.224 NA NA 9
10 (461067.3, 5588346) NA NA 0.197 NA NA 10
11 (461069.7, 5588365) NA NA 0.237 NA NA 11
12 (461079.2, 5588364) NA NA 0.236 NA NA 12
13 (461079.4, 5588383) NA NA 0.194 NA NA 13
14 (461079.4, 5588383) NA NA 0.222 NA NA 14
15 (461091.2, 5588383) NA NA 0.198 NA NA 15
16 (461091.3, 5588401) NA NA 0.000 NA NA 16
17 (461091.3, 5588401) NA NA 0.197 NA NA 17
18 (461103.2, 5588401) NA NA 0.211 NA NA 18
19 (461103.3, 5588420) NA NA 0.156 NA NA 19
20 (461103.3, 5588420) NA NA 0.190 NA NA 20
21 (461115.1, 5588420) NA NA 0.186 NA NA 21
22 (461115.1, 5588420) NA NA 0.193 NA NA 22
23 (461115.3, 5588438) NA NA 0.178 NA NA 23
24 (461115.3, 5588438) NA NA 0.210 NA NA 24
25 (461127.1, 5588438) NA NA 0.152 NA NA 25
26 (461127.3, 5588457) NA NA 0.194 NA NA 26
27 (461127.3, 5588457) NA NA 0.203 NA NA 27
28 (461139.2, 5588475) NA NA 0.000 NA NA 28
29 (461139.2, 5588475) NA NA 0.224 NA NA 29
30 (461140.7, 5588494) NA NA 0.192 NA NA 30
[continues...]
I've tried using different variables of the data (hs1) and the error persists.
Note in the above examples I've not entered an argument for new_data (a prediction grid) but the error still persists when I use a grid.
Does anyone have any ideas what the problem may be?
Best Answer
The usual problem with kriging is duplicate locations. You can note that even the function
autoKrige
gives you the following warning:You can also compare
length(hs1@coords)
andlength(unique(hs1@coords))
, which gives you results 1104 and 730 respectively.If you use for example the following construct to remove the duplicated points, the cross-validation will run fine:
giving the output:
etc.