I have used OLS and GWR to validate the dependencies between two distinct databases. The Residual Square for GWR is 0.82, which makes it the right regression model to use for determining the relationship between the two datasets.
What I want to know is: given that GWR is a local regression and OLS is a global regression, which should be used where and when?
Also, what does it really mean if Moran's I for the GWR model is Random?
Best Answer
What these procedures are
Although OLS and GWR share many aspects of their statistical formulation, they are used for different purposes.
Here is an OLS fit:
Here is a locally weighted smooth. Notice how it can follow the apparent "wiggles" in the data, but does not pass exactly through every point. (It can be made to pass through the points, or to follow smaller wiggles, by changing a setting in the procedure, exactly as GWR can be made to follow spatial data more or less exactly by changing settings in its procedure.)
Intuitively, think of OLS as fitting a rigid shape (such as a line) to the scatterplot of (x,y) pairs and GWR as allowing that shape to wiggle arbitrarily.
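To make the rigid-versus-wiggly contrast concrete, here is a minimal one-dimensional sketch. It is not GWR itself: it is a locally weighted linear smooth with a Gaussian kernel, run on synthetic data made up for illustration. The function names and the `bandwidth` parameter are assumptions of this sketch, standing in for the neighborhood setting of a GWR procedure.

```python
import numpy as np

def ols_fit(x, y):
    """Global straight-line fit: one intercept and one slope for all the data."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return intercept + slope * x

def local_linear_smooth(x, y, bandwidth):
    """Locally weighted linear fit (illustrative): refit a line around each point."""
    fitted = np.empty(len(x), dtype=float)
    for i, x0 in enumerate(x):
        # Gaussian kernel: nearby points get weight near 1, distant points near 0.
        w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)
        # np.polyfit multiplies residuals by its weights before squaring,
        # so pass sqrt(w) to obtain kernel weights w on the squared residuals.
        slope, intercept = np.polyfit(x, y, deg=1, w=np.sqrt(w))
        fitted[i] = intercept + slope * x0
    return fitted

# Synthetic data (an assumption of this sketch): a trend plus "wiggles" plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 0.5 * x + np.sin(x) + rng.normal(scale=0.3, size=x.size)

rss_ols = np.sum((y - ols_fit(x, y)) ** 2)
rss_narrow = np.sum((y - local_linear_smooth(x, y, bandwidth=0.5)) ** 2)
rss_wide = np.sum((y - local_linear_smooth(x, y, bandwidth=1000.0)) ** 2)

print(f"residual sum of squares, OLS:            {rss_ols:.2f}")
print(f"residual sum of squares, narrow window:  {rss_narrow:.2f}")
print(f"residual sum of squares, very wide:      {rss_wide:.2f}")
```

With a very wide window every point gets nearly the same weight, so the local fit collapses to the global OLS line; with a narrow window it follows the wiggles and its residuals shrink. A better fit statistic from the local method therefore reflects the chosen setting as much as the data, which is why it cannot by itself show that one model is "right".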
Choosing between them
In the present case, although it is not clear what "two distinct databases" might mean, it seems that using either OLS or GWR to "validate" a relationship between them may be inappropriate. For instance, if the databases represent independent observations of the same quantity at the same set of locations, then (1) OLS is probably inappropriate because both x (the values in one database) and y (the values in the other database) should be conceived of as varying (instead of thinking of x as fixed and accurately represented) and (2) GWR is fine for exploring the relationship between x and y, but it cannot be used to validate anything: it's guaranteed to find relationships, no matter what. Moreover, the symmetric roles of the "two databases" indicate that either could be chosen as x and the other as y, leading to two possible GWR results which are guaranteed to differ.
Here is a locally weighted smooth of the same data, reversing the roles of x and y. Compare this to the previous plot: notice how much steeper the overall fit is and how it differs in the details, too.
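The same asymmetry already appears with plain OLS, which makes it easy to check numerically. The sketch below uses synthetic data (an assumption for illustration): two noisy measurements of one underlying quantity, playing the part of the two databases. It fits y on x, then x on y, and expresses the second line as a slope on the original axes.

```python
import numpy as np

# Synthetic stand-in for "two databases": both measure the same underlying
# quantity, each with its own independent error (an assumption of this sketch).
rng = np.random.default_rng(1)
truth = rng.normal(size=500)
x = truth + rng.normal(scale=0.5, size=truth.size)
y = truth + rng.normal(scale=0.5, size=truth.size)

slope_y_on_x = np.polyfit(x, y, deg=1)[0]   # regress y on x
slope_x_on_y = np.polyfit(y, x, deg=1)[0]   # regress x on y (roles reversed)

# Drawn on the same (x, y) axes, the reversed fit has slope 1 / slope_x_on_y.
implied_slope = 1.0 / slope_x_on_y
r = np.corrcoef(x, y)[0, 1]

print(f"slope of y on x:                  {slope_y_on_x:.3f}")
print(f"slope implied by x on y:          {implied_slope:.3f}")
print(f"product of the two fitted slopes: {slope_y_on_x * slope_x_on_y:.3f}")
print(f"squared correlation r^2:          {r ** 2:.3f}")
```

The product of the two fitted slopes equals the squared correlation, so the two lines coincide only when the correlation is perfect. Otherwise the reversed fit is the steeper one on the original axes, and neither is the "correct" description of a symmetric relationship.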
Different techniques are required to establish that two databases are providing the same information, or to assess their relative bias, or relative precision. The choice of technique depends on the statistical properties of the data and the purpose of the validation. As an example, databases of chemical measurements will typically be compared using calibration techniques.
Interpreting Moran's I
It is hard to tell what a "Moran's I for the GWR model" means. I guess that a Moran's I statistic may have been computed for the residuals of a GWR calculation. (The residuals are the differences between actual and fitted values.) Moran's I is a global measure of spatial correlation. If it is small, it suggests that variations between the y-values and the GWR fits from the x-values have little or no spatial correlation. When GWR is "tuned" to the data (this involves deciding on what really constitutes a "neighbor" of any point), low spatial correlation in the residuals is to be expected because GWR (implicitly) exploits any spatial correlation among the x and y values in its algorithm.
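As a rough illustration of what the statistic measures, here is a minimal sketch of global Moran's I written directly from its usual definition. The 10 by 10 grid, the binary "adjacent cell" weights, and the two sets of stand-in residuals are all assumptions made up for the example; they are not taken from the question, and a GIS package may use a different weighting scheme.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z the centered values."""
    z = values - values.mean()
    s0 = weights.sum()
    return (len(values) / s0) * (z @ weights @ z) / (z @ z)

# Hypothetical locations: the cells of a 10 x 10 grid.
gx, gy = np.meshgrid(np.arange(10), np.arange(10))
coords = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)

# Binary weights: 1 for cells sharing an edge (distance exactly 1), else 0.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
weights = ((dist > 0) & (dist <= 1.0)).astype(float)

# Two stand-in sets of "residuals": a smooth west-to-east gradient (strongly
# spatially correlated) and independent noise (no spatial structure).
rng = np.random.default_rng(2)
i_trend = morans_i(coords[:, 0], weights)
i_random = morans_i(rng.normal(size=len(coords)), weights)

print(f"Moran's I, smooth gradient:   {i_trend:.3f}")
print(f"Moran's I, independent noise: {i_random:.3f}")
```

A value well above zero means that neighboring residuals resemble each other, which suggests spatial structure the model has not captured. A value near zero (its expectation under no spatial correlation is -1/(n-1)) corresponds to the "Random" outcome reported by software.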