[GIS] GWR – “significant” local regions

geographically-weighted-regression

I ran a Geographically Weighted Regression (GWR) model to find out in which local regions my independent variables have a strong, moderate or weak relationship.
After doing so, I only want to focus my further analysis on regions with a strong relationship, but the GWR tool in ArcGIS does not provide any p-values for the regions.

Which approach would be appropriate for choosing "significant" regions? Would it make sense to include all regions where the coefficient value is higher than the mean, or higher than the mean plus one standard deviation?

Best Answer

It looks like ESRI has purposefully removed p-values from the software due to the dubious nature of a p-value when it comes to GWR.

"I do know that our consultant's GWR software (Fotheringham, Charlton and Martin) does compute p-values for each coefficient in every one of the local linear equations. However, [we've removed it] because doing so is really not appropriate (and we've discussed this with our consultants and they agree with that assessment)..." - emphasis and context added

However, you can always calculate a local or pseudo t-value by:

  1. Adding a Field
  2. Using Field Calculator: divide the local Coefficient by the local Standard Error.
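As a minimal sketch of the same calculation outside ArcGIS, the snippet below divides each local coefficient by its local standard error. The row layout and the keys `coef` and `se` are hypothetical stand-ins for whatever your GWR output table calls those fields.

```python
def pseudo_t(coefficient, std_error):
    """Local (pseudo) t-value: local coefficient / local standard error."""
    if std_error is None or std_error == 0:
        return None  # the ratio is undefined without a usable standard error
    return coefficient / std_error


# Hypothetical rows standing in for the GWR output table.
rows = [
    {"id": 1, "coef": 0.80, "se": 0.20},
    {"id": 2, "coef": -0.30, "se": 0.25},
    {"id": 3, "coef": 0.10, "se": 0.0},
]

for row in rows:
    row["t"] = pseudo_t(row["coef"], row["se"])
```

Returning None for a zero or missing standard error keeps such features out of later comparisons instead of raising a division error.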

Estimating a local p-value: If you want to estimate a p-value, the GWmodel R package uses the formula below. tvals is effectively your pseudo t-score. enp is the effective number of parameters, or EffectiveNumber in the ArcGIS output (see my post). pt is the R function for the cumulative distribution of Student's t, from which the p-value is derived.

pvals <- round(2 * (1 - pt(abs(tvals), enp)), 3) (line 31)

This Mathematics question would give you the necessary sources to calculate a p-value. Note that the authors of the GWR model used the Effective Number of Parameters (enp) here instead of the Effective Degrees of Freedom (edf). 1 - pt() gives the upper-tail probability beyond |t|, and multiplying it by 2 turns that into a two-tailed p-value.
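If R is not at hand, the same quantity can be approximated with the Python standard library alone. This is only a sketch: instead of calling a library t-distribution function, it integrates the Student's t density numerically with Simpson's rule, and the values of `tvals` and `enp` are made up for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t_value, df, steps=2000):
    """Approximate 2 * (1 - pt(|t|, df)) with Simpson's rule on [0, |t|]."""
    upper = abs(t_value)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_area)


# Hypothetical pseudo t-values and effective number of parameters.
tvals = [4.0, -1.2, 2.5]
enp = 12.3
pvals = [round(two_tailed_p(t, enp), 3) for t in tvals]
```

The degrees-of-freedom argument does not have to be an integer, which matters here because enp usually is not one.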

Unless it is absolutely necessary for you to have p-values, I would suggest using a cut-off from a t-distribution for general exploratory understanding.
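A cut-off of this kind could be applied roughly as follows. The threshold of 1.96 is an assumption: it is the large-sample two-tailed 5% critical value, and the corresponding t-distribution value is larger when the degrees of freedom are small. The feature layout is hypothetical as well.

```python
T_CUTOFF = 1.96  # assumed threshold; pick the critical value that fits your model


def strong_regions(features, cutoff=T_CUTOFF):
    """Return the ids of features whose |pseudo t| reaches the cut-off."""
    return [
        f["id"]
        for f in features
        if f["t"] is not None and abs(f["t"]) >= cutoff
    ]


# Hypothetical features carrying a precomputed pseudo t-value.
features = [
    {"id": 1, "t": 4.0},
    {"id": 2, "t": -1.2},
    {"id": 3, "t": -2.5},
    {"id": 4, "t": None},
]

selected = strong_regions(features)
```

Taking the absolute value keeps strongly negative relationships in the selection, which a "coefficient above the mean" rule would drop.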

The above formula comes from the gwr.t.adjust() function in the GWmodel package, which can be used to adjust the raw t-values to account for the increased chance of a Type I error that comes from sampling the same data over and over again. See this paper, p. 25, for more details on this topic.
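To give a feel for what such an adjustment does, here is a sketch of two generic multiple-testing corrections, Bonferroni and Benjamini-Hochberg, applied to a list of local p-values. It is an illustration under that assumption, not a port of gwr.t.adjust(), and the input values are made up.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical local p-values, one per feature.
local_p = [0.01, 0.04, 0.03, 0.5]
p_bonferroni = bonferroni(local_p)
p_bh = benjamini_hochberg(local_p)
```

Bonferroni is the more conservative of the two, so fewer regions survive it; Benjamini-Hochberg controls the false discovery rate instead and usually keeps more.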
