This question has come up several times, yet most solutions depend on a transformation to an equal-area CRS. These solutions can be slow with large numbers of polygons, and can be inaccurate at high latitudes.
For example:
How to determine a polygon's area in a metric unit?
Area in KM from Polygon of coordinates
Getting polygon areas using GeoPandas
Calculating the Area by Square Feet with Geopandas
Is there a way to calculate the area of geographic coordinates without transforming?
Best Answer
Overview
Since version 0.7.0, geopandas has used the pyproj library for its crs object. Since version 2.3.0, pyproj has been able to calculate the geodesic area of arbitrary polygons on an ellipsoid (see https://pyproj4.github.io/pyproj/stable/api/geod.html). The math behind this method ultimately comes from the geographiclib library. This gives a straightforward way to do the calculation with minimal overhead and no new dependencies.
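A minimal sketch of that pyproj call on its own, assuming pyproj >= 2.3.0 is installed. The WGS84 ellipsoid and the one-degree test cell are illustrative choices, not something taken from the comparison below.

```python
from pyproj import Geod

# WGS84 ellipsoid; with geopandas the same kind of object is available
# from the GeoDataFrame's CRS via gdf.crs.get_geod()
geod = Geod(ellps="WGS84")

# A 1 x 1 degree cell at the equator, vertices listed counter-clockwise
lons = [0.0, 1.0, 1.0, 0.0]
lats = [0.0, 0.0, 1.0, 1.0]

# Returns (area in m^2, perimeter in m). The area is signed, positive for
# counter-clockwise rings, so take the absolute value for a plain area.
area, perimeter = geod.polygon_area_perimeter(lons, lats)
area_km2 = abs(area) / 1e6
print(f"{area_km2:.0f} km^2")
```

For shapely geometries, `Geod.geometry_area_perimeter` accepts the polygon directly and returns the same `(area, perimeter)` pair.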
There is also an alternative version detailed in this answer which I use here for comparison. This is based on the line integral and Green's theorem. I've adapted it slightly to work with shapely polygons.
The following code block describes these two implementations. Both return the area, in meters^2, of polygons in geographic coordinates.
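To illustrate the line-integral idea in isolation, here is a simplified sketch. It is not the adapted code from the linked answer: it assumes a spherical Earth, treats each edge with a trapezoid rule, and does not handle rings that cross the antimeridian or enclose a pole. The function name and radius constant are my own. By Green's theorem, the area of a region on a sphere of radius R equals R² times the absolute value of the closed integral of sin(lat) d(lon) around its boundary.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation

def ring_area_line_integral(lons, lats, radius=EARTH_RADIUS_M):
    """Approximate area (m^2) of a lon/lat ring as R^2 * |integral of sin(lat) dlon|."""
    n = len(lons)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to close the ring
        dlon = math.radians(lons[j] - lons[i])
        sin_i = math.sin(math.radians(lats[i]))
        sin_j = math.sin(math.radians(lats[j]))
        # Trapezoid rule for sin(lat) along this edge
        total += dlon * (sin_i + sin_j) / 2.0
    return abs(total) * radius ** 2

# 1 x 1 degree cell at the equator
area = ring_area_line_integral([0, 1, 1, 0], [0, 0, 1, 1])
print(f"{area / 1e6:.0f} km^2")
```

Because the loop is a sum over consecutive vertex pairs, it vectorizes naturally with numpy for polygons with many vertices.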
Accuracy Comparison
Accuracy Methods
To compare the accuracy I used the Natural Earth States 10m shapefile (https://www.naturalearthdata.com/downloads/10m-cultural-vectors/). These more complex polygons are a better real-world test than squares or rectangles.
All code blocks below assume the above two blocks are loaded.
I selected 15 different states/provinces around the world and obtained their "true" area from Wikipedia. It is not clear how accurate this "true" area is, but it provides a decent baseline. I made sure to select areas at high, mid, and low latitudes, and some with numerous islands.
For the comparison I calculated percentage error for the 2 methods.
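A sketch of the percentage error calculation, assuming the usual signed definition relative to the reference area. The function name and the numbers are hypothetical and only for illustration; they are not results from this comparison.

```python
def percent_error(estimate, reference):
    """Signed percentage error of an estimate relative to a reference value."""
    return (estimate - reference) / reference * 100.0

# Hypothetical values: an estimate of 101 against a reference of 100
print(f"{percent_error(101.0, 100.0):+.2f}%")
```

A positive value means the method overestimates the reference area, a negative value means it underestimates.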
Accuracy Results
Both methods get extremely close to the reference values, with errors under 1% in many cases. Neither has the lower error in every case, and the magnitude of error is approximately the same for each. So neither method is more accurate than the other; they are essentially equal.
Driver of accuracy
Although the two methods are equally accurate, some errors are still larger than others. What causes the estimates to be off? I tested 3 potential causes: latitude, polygon size, and the number of points in the polygon.
The results are shown in the figures.
Only latitude shows any kind of correlation with error. Neither larger sizes nor more points within the polygon contribute to higher error. Also note that the error is still extremely small here.
Drivers of disagreement
I did another test here where I recalculated areas for the full Natural Earth shapefile and compared the disagreement between the two methods across all polygons.
First off, Antarctica has the highest disagreement. Since this land mass is centered around a pole, it's likely the calculation begins to break down there.
Excluding Antarctica there is high agreement between the two methods.
Looking at percentage differences to highlight those small discrepancies, an interesting pattern emerges: the disagreement between the two methods is related to latitude. They essentially agree at around 45° latitude, north and south. There is likely a mathematical explanation for that.
There is no correlation between the disagreement percent and area or number of vertices.
Here is the code to generate the figures.
Speed test
For a speed comparison I recalculated the full Natural Earth shapefile.
The line integral method is clearly faster.
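A sketch of how a timing comparison like this can be set up with the standard library's `timeit`. The helper name and the placeholder workload are hypothetical. In practice the callable would be each area function applied to the GeoDataFrame.

```python
import timeit

def time_per_call(func, *args, repeat=5, number=10):
    """Best-of-`repeat` average seconds per call of func(*args)."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number

# Stand-in workload; swap in the real area functions and data
def placeholder_area(values):
    return sum(v * v for v in values)

seconds = time_per_call(placeholder_area, list(range(10_000)))
print(f"{seconds * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats reduces the influence of other processes on the measurement.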
The special case of regular latitude/longitude grids
One final thing to mention is the case where the polygons form a regular grid of square cells whose sides are lines of latitude and longitude. I wrote about this in this answer, and will quickly compare it here for a speed test.
The method implemented
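As a rough illustration of why this case is so cheap, here is a sketch of the closed-form area of a cell bounded by two parallels and two meridians. It assumes a spherical Earth, and the function name and signature are my own for this sketch, so the results will differ slightly from an ellipsoidal calculation.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation

def grid_cell_area(lat_south, lat_north, lon_west, lon_east, radius=EARTH_RADIUS_M):
    """Area (m^2) of a cell bounded by two parallels and two meridians on a sphere."""
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    width = math.radians(lon_east - lon_west)
    return radius ** 2 * band * width

equator_cell = grid_cell_area(0, 1, 0, 1)
polar_cell = grid_cell_area(89, 90, 0, 1)
print(f"equator: {equator_cell / 1e6:.0f} km^2, pole: {polar_cell / 1e6:.1f} km^2")
```

No vertex traversal is needed: each cell costs two sine evaluations, and every cell in the same row of the grid has the same area.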
Setting up a regular grid
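A sketch of generating the cell bounds for a global grid, assuming the step divides 360 and 180 evenly. The function name and the 10 degree step are illustrative choices.

```python
def regular_grid_bounds(step_deg):
    """Return (lon_west, lat_south, lon_east, lat_north) for every cell of a global grid."""
    n_lon = round(360 / step_deg)
    n_lat = round(180 / step_deg)
    cells = []
    for i in range(n_lon):
        for j in range(n_lat):
            west = -180 + i * step_deg
            south = -90 + j * step_deg
            cells.append((west, south, west + step_deg, south + step_deg))
    return cells

cells = regular_grid_bounds(10)
print(len(cells), cells[0])
```

Each tuple is in (minx, miny, maxx, maxy) order, so it can be unpacked into `shapely.geometry.box` to build the polygons for a GeoDataFrame.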
Regular grid speed test
The cell method, meant only for regular grids and not complex polygons, is significantly quicker than the other two methods. Interestingly, the speedup of the line integral method over the pyproj method is smaller here than it is with the more complex polygons. This suggests the speedup there is likely due to the vectorized operations used in the line integral method, which offer little improvement when polygons consist of only 4 points.
With the regular grid polygons I can also compare the agreement between methods.
The cell method has near-perfect agreement with the pyproj method, but for the largest cells it has some large disagreement with the line integral method. Remember these are not country or province polygons but grid cells on a uniform lat/lon grid. The largest cells in a regular grid are at the equator, where the line integral method vastly overestimates some of them.
I traced this back to a grid cell with coords right at the equator. This is likely an edge condition for when grid cells have one of their bounds at 0.
Conclusions:
If you want to calculate the area of polygons with geographic coordinates in geopandas:
For regular lat/lon grids, the cell method (lat_lon_cell_area) is extremely quick, straightforward, and produces approximately the same results as pyproj.
Other Considerations:
As of Fall 2021:
Versions used here: