[GIS] More accurate way to calculate area of rasters

areacoordinate systemraster

In my daily work, I am constantly asked to calculate areas of global raster datasets in geographic projection at 30 arc second resolution. These datasets are normally the result of a Combine operation (a typical example is a vegetation classes combined with country layer). To do this, our unit created a raster dataset with the area of each pixel in geographic projection at 30 arc second. With this area grid, a zonalstat is performed to sum the areas for each class. Since I am not sure how this area grid was created, I always wondered whether this approach is more accurate of just reprojecting the raster in a equal-area projection (from simple tests the results of the two methods are similar). Does anybody experienced a similar situation?

Best Answer

There is a relatively simple exact formula for the area of any spherical quadrangle bounded by parallels (lines of latitude) and meridians (lines of longitude). It can be derived straightforwardly using basic properties of the ellipse (of major axis a and minor axis b) that is rotated around its minor axis to produce the ellipsoid. (The derivation makes a nice integral Calculus exercise but I believe would be of little interest on this site.)

The formula is simplified by breaking the calculation down into basic steps.

First, the distance between the east and west boundaries--the meridians l0 and l1--is a fraction of a whole circle equal to q = (l1 - l0)/360 (when the meridians are measured in degrees) or 1 = (l1 - l0) / (2*pi) (when the meridians are measured in radians). Find the area of the entire slice located between parallels f0 and f1 and just multiply that by q.

Second, we will employ a formula for the area of a horizontal slice of the ellipsoid bounded by the Equator (at f0=0) and a parallel at latitude f (=f1). The area of the slice between any two latitudes f0 and f1 (lying on the same hemisphere) will be the difference between the larger and smaller area.

Finally, provided the model is truly an ellipsoid (and not a sphere), the area of such a slice between the Equator and the parallel at latitude f is given by

area(f) = pi * b^2 * (log(zp/zm) / (2*e) + sin(f) / (zp*zm))

where a and b are the lengths of the major and minor axes of the generating ellipse, respectively,

e = sqrt(1 - (b/a)^2)

is its eccentricity, and

zm = 1 - e*sin(f); zp = 1 + e*sin(f)

(This is much simpler than computing with geodesics, which are only approximations to the parallels anyway. Please note the comment by @cffk concerning a way to compute log(zp/zm) in a way that avoids loss of precision at low latitudes.)

Figure

area(f) is the area of the opaque slice from the equator up to latitude f (around 30 degrees north in the illustration. X and Y are geocentric Cartesian coordinate axes shown for reference.

For the WGS 84 ellipsoid, use the constant values

a = 6 378 137 meters,  b =  6 356 752.3142 meters,

entailing

e = 0.08181919084296

(For a spherical model with a = b, the formula becomes indefinite. You have to take a limit as e --> 0 from above, which then reduces to the standard formula 2 * pi * a^2 * sin(f).)

According to these formulas, a 30' by 30' quadrangle based on the Equator has an area of 3077.2300079129 square kilometers, while a 30' by 30' quadrangle touching a pole (which is really just a triangle) has an area of only 13.6086152 square kilometers.

As a check, the formulas applied to all cells of a 720 by 360 grid covering the earth's surface give a total surface area of 4 * pi * (6371.0071809)^2 square kilometers, indicating that the earth's authalic radius should be 6371.0071809 kilometers. This differs from the Wikipedia value only in the last significant figure (about a tenth of a millimeter). (I think Wikipedia's calculations are off a little :-).

As additional checks, I used versions of these formulas to reproduce Appendices 4 and 5 in Lev M. Bugayevskiy & John P. Snyder, Map Projections: A Reference Manual (Taylor & Francis, 1995). Appendix 4 shows arc lengths of 30'-long sections of meridians and parallels, given to the nearest meter. A spot check of results showed perfect agreement. I then recreated the table with 0.0005' increments, rather than 0.5' increments, and numerically integrated the quadrangle areas as estimated with these arc lengths. The total area of the ellipsoid was accurately reproduced to better than eight significant figures. Appendix 5 shows the values of area(f) for f = 0, 1/2, 1, ..., 90 degrees, multiplied by 1/(2*pi). These values are given to the nearest square kilometer. A visual check of values near 0, 45 and 90 degrees showed perfect agreement.


This exact formula can be applied using raster algebra beginning with a grid giving the latitudes of the upper limits of each cell and another giving the latitudes of the lower limits. Each of these is, essentially, a y-coordinate grid. (In each case you might want to create sin(f) and then zm and zp as intermediate results.) Subtract the two results, take the absolute value of that, and multiply by the fraction q obtained in the first step (equal to 0.5/360 = 1/720 for a 30' cell width, for instance). This will be a grid whose values contain the exact areas of each cell (up to the grid's own numerical precision). Just make sure to express the latitudes in the form expected by the sine function: many raster calculators will give you coordinates in degrees but expect radians for their trig functions!


For the record, here are the exact areas of 30' by 30' cells on the WGS 84 ellipsoid from the Equator up to a pole, in intervals of 30', to 11 figures (the same number used for the minor radius b):

3077.2300079,3077.0019391,3076.5458145,3075.8616605,3074.9495164,3073.8094348,3072.4414813,3070.8457347,3069.0222870,3066.9712434,3064.6927222,3062.1868550,3059.4537865,3056.4936748,3053.3066912,3049.8930202,3046.2528597,3042.3864209,3038.2939285,3033.9756204,3029.4317480,3024.6625762,3019.6683833,3014.4494612,3009.0061153,3003.3386648,2997.4474422,2991.3327939,2984.9950800,2978.4346744,2971.6519646,2964.6473522,2957.4212526,2949.9740951,2942.3063230,2934.4183938,2926.3107788,2917.9839636,2909.4384482,2900.6747464,2891.6933866,2882.4949115,2873.0798782,2863.4488581,2853.6024374,2843.5412166,2833.2658109,2822.7768503,2812.0749792,2801.1608571,2790.0351582,2778.6985716,2767.1518013,2755.3955665,2743.4306011,2731.2576543,2718.8774905,2706.2908892,2693.4986451,2680.5015685,2667.3004848,2653.8962347,2640.2896746,2626.4816763,2612.4731271,2598.2649300,2583.8580035,2569.2532818,2554.4517149,2539.4542684,2524.2619238,2508.8756783,2493.2965451,2477.5255533,2461.5637477,2445.4121891,2429.0719545,2412.5441367,2395.8298444,2378.9302026,2361.8463521,2344.5794500,2327.1306692,2309.5011988,2291.6922441,2273.7050264,2255.5407830,2237.2007674,2218.6862492,2199.9985139,2181.1388633,2162.1086151,2142.9091030,2123.5416769,2104.0077025,2084.3085615,2064.4456516,2044.4203864,2024.2341953,2003.8885234,1983.3848318,1962.7245972,1941.9093120,1920.9404843,1899.8196375,1878.5483108,1857.1280585,1835.5604507,1813.8470724,1791.9895239,1769.9894206,1747.8483931,1725.5680867,1703.1501618,1680.5962932,1657.9081707,1635.0874985,1612.1359952,1589.0553936,1565.8474409,1542.5138984,1519.0565410,1495.4771578,1471.7775513,1447.9595378,1424.0249466,1399.9756206,1375.8134157,1351.5402005,1327.1578567,1302.6682785,1278.0733724,1253.3750574,1228.5752643,1203.6759360,1178.6790272,1153.5865040,1128.4003439,1103.1225355,1077.7550785,1052.2999830,1026.7592702,1001.1349711,975.42912705,949.64378940,923.78101904,897.84288636,871.83147097,845.74886152,819.59715539,793.37845851,767.09488512,740.74855748,714.34160569,687.87616739,661.35438752,634.77841811,608.15041795,581.47255240,554.74699308,527.97591765,501.16150951,474.30595754,447.41145586,420.48020351,393.51440422,366.51626611,339.48800143,312.43182627,285.34996030,258.24462644,231.11805066,203.97246162,176.81009042,149.63317034,122.44393648,95.244625564,68.037475592,40.824725575,13.608615243

The values are in square kilometers.


If you would like to approximate these areas or simply understand their behavior better, the formula reduces to a power series following this pattern:

area(f) = 2 * pi * b^2 * z * (1 + (4/3)y + (6/5)y^2 + (8/7)y^3 + ...)

where

z = sin(f), y = (e*z)^2.

(An equivalent formula appears in Bugayevskiy & Snyder, op. cit., equation (2.1).)

Since e^2 is so small (around 1/150 for all ellipsoidal models of the earth) and z lies between 0 and 1, y is small, too. Thus the terms y^2, y^3, ... rapidly get smaller, adding over two more decimal places precision with each term. If we were to ignore y altogether, the formula would be that of the area of a sphere of radius b. The remaining terms can be understood as correcting for the earth's equatorial bulge.


Edit

Some questions have been raised concerning how a geodesic-distance calculation of area compares with these exact formulas. The geodesic distance method approximates each quadrangle by the geodesics, rather than parallels, that connect its corners horizontally, and applies the Euclidean formula for a trapezoid. For small quadrangles, such as 30' quads, this is biased slightly low and has a relative accuracy between 6 and 10 parts per million. Here is a plot of the error for WGS 84 (or any reasonable earth ellipsoid, for that matter):

Figure 2

Thus, if (1) you have easy access to geodesic distance calculations and (2) can tolerate ppm-level error, you might consider using those geodesic calculations and multiplying their results by 1.00000791 to correct the bias. For two more decimal places of precision, subtract pi/2 * cos(2f) / 10^6 from the correction factor: the result will be accurate to within 0.04 ppm.

Related Question