Solved – Pattern recognition techniques in spatial or spatio-temporal data

machine learning, r, similarities, spatial, spatio-temporal

I am working with weather forecasters and have access to historical climatology data. Given current weather conditions in an area of interest (i.e. the current "map"), we want to try to find the most similar "map" from the past data. The idea is to try and make a weather forecast by finding the best analog from past data.

The data is represented as a regular X by Y grid (i.e. a matrix) of points, where X is the horizontal location and Y is the vertical location, and the (X,Y)th value in the matrix represents the response variable Z at that location. In addition, the grid points are evenly spaced out. As an example, Z can be a measure of surface temperature, which is measured at each of the grid points.

We take care of seasonal effects by restricting the search in the past data to a window of +/- 15 days around the test date. For example, if we want to find the best analog for a map from 2013-06-19, we would only consider maps from 2012-06-19 +/- 15 days, 2011-06-19 +/- 15 days, etc. We also restrict the search to observations taken at the same time of day as the test observation. For example, if the test observation was taken at noon, then we only look at past observations taken at noon.

I have two questions.

(1) Given two grids (or "maps" or matrices) of data, how can I best calculate the similarity between them? Are there methods that take into account the spatial nature of the data? For example, point (1,1) will be highly correlated with the nearby point (1,2), etc.

I am currently using a very simple distance metric, where I just take the difference of the two maps and find the Frobenius norm. The map from the past that yields the smallest value is the 'closest' map to the test conditions.
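
For reference, the baseline just described can be sketched as follows. This is only a minimal illustration in Python with NumPy (the same steps translate directly to R); the names `day_of_year_gap` and `best_analog` are hypothetical, the maps are assumed to be already filtered to the same time of day, and the +/- 15 day window is approximated by comparing day-of-year numbers, which can be off by one day around leap years.

```python
from datetime import date

import numpy as np


def day_of_year_gap(d1, d2):
    """Approximate number of calendar days between two dates, ignoring the year."""
    gap = abs(d1.timetuple().tm_yday - d2.timetuple().tm_yday)
    # Wrap around New Year; leap days are ignored (assumption: 365-day year).
    return min(gap, 365 - gap)


def best_analog(test_map, test_date, past_maps, past_dates, window=15):
    """Return (index, distance) of the past map closest in Frobenius norm.

    Only maps dated strictly before test_date and within +/- window days
    of the same point in the season are considered.
    """
    best_idx, best_dist = None, np.inf
    for i, (past_map, past_date) in enumerate(zip(past_maps, past_dates)):
        if past_date >= test_date:
            continue  # not in the past
        if day_of_year_gap(test_date, past_date) > window:
            continue  # outside the seasonal window
        # For a 2D array, np.linalg.norm defaults to the Frobenius norm.
        dist = np.linalg.norm(test_map - past_map)
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx, best_dist
```

If no past map falls inside the window, the function returns `(None, inf)`.
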

(2) I am new to spatial statistics and I am looking for literature that relates to what I am trying to do. What should I read to become familiar with working with grid data? What resources are there to learn about pattern recognition in spatial or spatio-temporal data?

(I want to mention that I am working in R, so I would welcome package recommendations as well!)

Best Answer

I don't have a cookbook answer but here are some initial thoughts:

I think that your idea of using the Frobenius norm is not unreasonable and can indeed serve as a safe first bet. You could use quite a few different metrics for matrix distances, but I will propose two based on the nature of your data:
1. Given that each climatic map can be viewed as the realization of a 2D Gaussian process in space, it might be interesting to estimate the hyper-parameters $\theta_{MAP}$ of that process for each map. You can then treat $\theta_{MAP}$ as containing information about the underlying dynamics of your process. Comparing the vectors $\theta_{MAP}$ will give an idea of the similarity between any two maps. (You could even cluster them after that.) For starters, a "standard" covariance function, the sum of a squared exponential and a Gaussian noise term, should do just fine. It would probably be interesting to think about how you would "zero-centre" your maps. You might need to look up kriging a bit more carefully (understand the difference between simple and ordinary kriging, for example, and you'll immediately see what I mean by "zero-centring" your maps). It will depend on whether you see your maps as coming from the same stationary process or not.
2. Treat all map instances as samples from the same forward model. Compute the eigen-maps from them and then compare the maps through their projection scores on the eigen-maps. The easiest reference for this is... Eigenfaces. Really, no joke: just read the article and each time it reads "face", read "climatic map". Everything is there. Don't be put off by the PCA step; your covariance matrix will be $N \times N$, where $N$ is your sample size, not your map size.
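
The eigen-map idea in the second suggestion can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not a reference implementation; the function names (`fit_eigenmaps`, `project`, `nearest_in_score_space`) are hypothetical, and the maps are assumed to be stacked in an array of shape `(N, X, Y)` with no missing values.

```python
import numpy as np


def fit_eigenmaps(maps, n_components):
    """Compute eigen-maps from an array of maps with shape (N, X, Y).

    n_components should be smaller than N, because the centred data
    has rank at most N - 1.
    """
    n = maps.shape[0]
    flat = maps.reshape(n, -1)          # one row per flattened map
    mean = flat.mean(axis=0)            # the "average map"
    centred = flat - mean
    # Eigenfaces trick: decompose the small N x N matrix instead of the
    # (X*Y) x (X*Y) covariance matrix.
    gram = centred @ centred.T
    eigvals, eigvecs = np.linalg.eigh(gram)        # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    # Map the N-dimensional eigenvectors back to map space and normalise.
    components = centred.T @ eigvecs[:, order]     # shape (X*Y, k)
    components /= np.linalg.norm(components, axis=0)
    scores = centred @ components                  # shape (N, k)
    return mean, components, scores


def project(new_map, mean, components):
    """Projection scores of a single (X, Y) map on the eigen-maps."""
    return (new_map.reshape(-1) - mean) @ components


def nearest_in_score_space(new_map, mean, components, scores):
    """Index of the past map whose scores are closest (Euclidean distance)."""
    dists = np.linalg.norm(scores - project(new_map, mean, components), axis=1)
    return int(np.argmin(dists)), dists
```

Comparing maps through a handful of scores rather than every grid point discards small-scale noise; how many components to keep is a judgment call that this sketch leaves to the user.
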
Kriging: If you are working in spatial statistics, it is of paramount importance to understand it. Practically everything else is done as an extension of, or in parallel to, this main technique. Understand what a variogram shows and how to read one. The Gaussian process regression (GPR) literature might also be helpful for a first read; GPR is essentially simple kriging, and texts describing GPR are usually less technical. For actual references on the matter, I will refer directly to the recommendations given by Peter Diggle:

Cressie (1991) remains a standard reference for spatial statistical models and methods. Possibly more accessible accounts (...) are: the introductory chapters of Rue and Held (2005) on discrete spatial variation, Diggle and Ribeiro (2007) on geostatistics and Diggle (2003) on point processes. Waller and Gotway (2004) cover all three sub-areas at an introductory level, with a focus on public health applications. Gelfand et al (2010) is an edited compilation that covers both spatial and spatio-temporal models and methods.
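
To make the variogram mentioned above a bit more concrete, here is a minimal sketch of an empirical semivariogram for a single map on a regular grid, written in Python with NumPy. It is a simplification under stated assumptions: the function name `empirical_variogram` is hypothetical, lags are measured in grid steps, only pairs aligned with the two grid axes are used, and the field is treated as isotropic with no missing values.

```python
import numpy as np


def empirical_variogram(z, max_lag):
    """Empirical semivariance gamma(h) for lags h = 1..max_lag (in grid steps).

    z is a 2D array on a regular grid; max_lag must be smaller than both
    grid dimensions. Pairs along both axes are pooled (isotropy assumed).
    """
    gamma = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        d_rows = z[h:, :] - z[:-h, :]   # pairs h steps apart along axis 0
        d_cols = z[:, h:] - z[:, :-h]   # pairs h steps apart along axis 1
        squared = np.concatenate([d_rows.ravel() ** 2, d_cols.ravel() ** 2])
        # Semivariance: half the mean squared difference at this lag.
        gamma[h - 1] = 0.5 * squared.mean()
    return gamma
```

Plotting `gamma` against the lag gives the curve one learns to read: where it levels off (the sill), how quickly it gets there (the range), and where it appears to start as the lag shrinks towards zero (the nugget).
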

For a machine learning perspective on Gaussian processes, I definitely recommend Gaussian Processes for Machine Learning by Rasmussen and Williams. Personally, I have used the Diggle & Ribeiro and the Rasmussen & Williams books a lot. Cressie has a lot of nice papers on the subject. I don't know your level of mathematical expertise, but it's a fun subject and I think you can gain traction relatively easily. When all is said and done, you just interpolate between points. Good luck.

Ah, when it comes to software, I think CRAN's Task Views on Temporal and SpatioTemporal data are the best starting point.
