I've been using ArcGIS for a couple of years, but when it comes to ModelBuilder I've only worked with very simple models. I'm hoping someone much smarter than me can help me build a complicated model.
I've got a very large table of data (about 300k rows) that contains points with lat/lon data, wind speed and date/time. There are about 60 different points with wind values that have been recorded for each hour during 2012. What I'd like to do is build a model that will:
- Iterate through each row and select/export the data associated with the first hour (00Z) of Jan 1st 2012 (about 60 rows) to a separate table, then the 2nd hour (01Z) of Jan 1st 2012 to its own separate table, and so on through all of 2012.
- Perform Inverse Distance Weighting (IDW) on each of the hourly tables of lat/lon and wind values created in step 1 above.
- Reclassify the rasters so that cells with wind values of 5 or greater are reclassified to "1" and all other cells are reclassified to "0".
- Then iterate through the rasters created in step 3 and use something like "Collect Values" and "Cell Statistics" to sum all of the rasters into a final summed raster for all hours in 2012.
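For reference, the step-1 split into hourly tables can be sketched outside ModelBuilder with pandas (the column names `datetime`, `lat`, `lon`, and `wind`, and the sample values, are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample rows: two stations, two hours (column names are assumptions).
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2012-01-01 00:00", "2012-01-01 00:00",
                                "2012-01-01 01:00", "2012-01-01 01:00"]),
    "lat":  [35.0, 35.5, 35.0, 35.5],
    "lon":  [-97.0, -97.5, -97.0, -97.5],
    "wind": [4.2, 6.1, 5.0, 3.3],
})

# One table per hour, ready to interpolate: group on the full timestamp,
# which sidesteps the date-only selection problems described above.
hourly = {ts: grp[["lat", "lon", "wind"]] for ts, grp in df.groupby("datetime")}
print(len(hourly))  # 2 hourly tables in this toy example
```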
I was going to include a screenshot of my data, but since this is my first post I'm not trusted yet. I've broken the date down several different ways (just date, just time, date-time, and date-time as text) because I've had issues iterating through and selecting the data in my original, very large table by date alone.
Hopefully I provided enough information, but if not I will try to go into more detail. Thanks in advance for any assistance.
Best Answer
Using Spatial Analyst, we can expect each iteration of IDW, thresholding, and accumulation of results to take from a fraction of a second to a few seconds. Multiplying by 24 * 365 = 8760 hours in a year suggests this calculation will take hours at the least.
It can be much faster. Faster is not always better, but here it will be a real advantage: there are significant issues concerning the interpolation method (IDW is unlikely to be accurate for such data), so being able to carry out the calculations quickly at a realistic scale makes it practical to see the results and experiment with other options.
To illustrate this, and to estimate how quickly it can really be done, I implemented the full-scale calculation in `R` using simulated data. (It is a simple matter to read your data into `R` and export the resulting grid in ESRI format using, say, the `raster` package.) After the usual preliminaries of getting the data into the right format (values are stored in a 60 by 8760 array and point locations in a 60 by 2 array of (x,y) coordinates), handling missing values, and so on, the entire computation comes down to two lines of code: `xy` holds the precomputed (unnormalized) IDW weights, `one` indicates where data are not missing, and `z` is the matrix of all the data. The two matrix multiplications `xy %*% one` and `xy %*% z` compute the denominators and numerators of all 8760 IDW calculations at once. The comparison in the second line creates the 8760 grids of ones and zeros, and `rowMeans` averages them for the entire year. The result, `a`, is a single grid of dimensions `n.x` by `n.y`.

By reducing this large number of linear interpolations to a few matrix operations, the fastest possible speed and efficiency can be attained. Moreover, this approach is general: literally any linear interpolation method (such as Kriging) can be implemented in this fashion.
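The same technique translates directly into numpy. The sketch below follows the array names in the text (`xy`, `one`, `z`); the toy grid size, random data, and the inverse-square power are assumptions, not the actual R code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_pts, n_hours = 50, 6, 24   # toy sizes; the real case is ~60 points x 8760 hours

# Precompute unnormalized IDW weights: 1/d^2 from every grid cell to every point.
cells = rng.uniform(0, 10, (n_cells, 2))
pts   = rng.uniform(0, 10, (n_pts, 2))
d     = np.linalg.norm(cells[:, None, :] - pts[None, :, :], axis=2)
xy    = 1.0 / d**2                    # (n_cells, n_pts) weight matrix

z   = rng.uniform(0, 10, (n_pts, n_hours))  # wind speeds, one column per hour
one = np.ones_like(z)                       # 1 where data present, 0 where missing

# The "two lines": every hourly IDW grid at once, then threshold and average.
idw = (xy @ z) / (xy @ one)           # numerators / denominators for all hours
a   = (idw >= 5).mean(axis=1)         # fraction of hours with wind >= 5, per cell

print(a.shape)  # (50,)
```

Because each hour is just a column of `z`, the per-hour loop disappears into the matrix products.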
On my workstation (running this on a single core, even though massive parallelization is possible), it takes just under 60 seconds to process one-quarter of the grid. By reducing the grid resolution, little detail is lost and the entire year's data can be processed in seconds.
To illustrate, here are the results for 60 points and 8760 hours on a 125 by 310 grid (half resolution; 58 seconds total time). For simplicity it uses all available data for the interpolation. If smaller local neighborhoods are desired, that would change only the one (very fast) precomputation step where the weights `xy` are computed; otherwise nothing would change.

The windspeeds during the first hour are plotted as circles whose areas are proportional to the average speed during the year; the averages range from 1 to 5.9 m/sec.
There is a price to pay: this method precomputes some values and stores them in RAM. This quarter-size calculation required 6.75 GB of RAM, so a full-size calculation would require about 27 GB. If you don't have that much, it is a simple matter to carve the grid into smaller rectangles, do the calculation for each rectangle separately, and mosaic the pieces afterwards. The total calculation time would not appreciably increase.
Complete code follows.