Calculate area of polygon given (x,y) coordinates
The most optimized solution that also covers all corner cases is to use a geometry package such as shapely, scikit-geometry or pygeos. All of them use C++ geometry libraries under the hood. The first is easy to install via pip:
pip install shapely
and simple to use:
from shapely.geometry import Polygon
pgon = Polygon(zip(x, y)) # Assuming the OP's x,y coordinates
print(pgon.area)
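shapely also handles the trickier cases mentioned further down; interior rings (holes), for instance, can be passed as a second argument to Polygon (the coordinates below are a made-up example):
from shapely.geometry import Polygon

outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole = [(2, 2), (4, 2), (4, 4), (2, 4)]
pgon = Polygon(outer, [hole])  # shell plus a list of interior rings
print(pgon.area)  # 96.0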
To build it from scratch or understand how the underlying algorithm works, check the shoelace formula:
# e.g. corners = [(2.0, 1.0), (4.0, 5.0), (7.0, 8.0)]
def PolygonArea(corners):
    n = len(corners)  # number of corners
    area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to the first
        area += corners[i][0] * corners[j][1]
        area -= corners[j][0] * corners[i][1]
    area = abs(area) / 2.0
    return area
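Using the example corners from the comment above:
print(PolygonArea([(2.0, 1.0), (4.0, 5.0), (7.0, 8.0)]))  # 3.0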
Note that this works only for simple polygons:
- If you have a polygon with holes: calculate the area of the outer ring and subtract the areas of the inner rings (see the sketch below).
- If you have self-intersecting rings: you have to decompose them into simple sectors.
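A minimal sketch of the holes case, reusing PolygonArea above (same made-up square-with-hole as the shapely example earlier):
# hypothetical outer ring and one inner ring (hole)
outer = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
hole = [(2.0, 2.0), (4.0, 2.0), (4.0, 4.0), (2.0, 4.0)]
print(PolygonArea(outer) - PolygonArea(hole))  # 96.0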
By analysing Mahdi's answer, I concluded that the majority of time was spent doing np.roll(). By removing the need for the roll, and still using numpy, I got the execution time down to 4-5µs per loop compared to Mahdi's 41µs (for comparison, Mahdi's function took an average of 37µs on my machine).
import numpy as np

def polygon_area(x, y):
    # wrap-around term closing the last vertex back to the first
    correction = x[-1] * y[0] - y[-1] * x[0]
    main_area = np.dot(x[:-1], y[1:]) - np.dot(y[:-1], x[1:])
    return 0.5 * np.abs(main_area + correction)
By calculating the correction term separately and then slicing the arrays, there is no need to roll or create a new array.
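A quick sanity check on the triangle from the shoelace example above (hypothetical arrays tri_x, tri_y):
tri_x = np.array([2.0, 4.0, 7.0])
tri_y = np.array([1.0, 5.0, 8.0])
print(polygon_area(tri_x, tri_y))  # 3.0, matching PolygonArea on the same corners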
Benchmarks (10000 iterations):
PolyArea(x,y): 37.075µs per loop
polygon_area(x,y): 4.665µs per loop
Timing was done using the time module and time.clock().
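Note that time.clock() was removed in Python 3.8; a minimal way to reproduce this kind of per-loop timing with the standard timeit module (assuming x and y are the coordinate arrays being benchmarked):
import timeit

n = 10000
t = timeit.timeit(lambda: polygon_area(x, y), number=n)
print(f"{t / n * 1e6:.3f} µs per loop")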
maxb's answer gives good performance but can easily lead to loss of precision when coordinate values are large or the polygon has many points. This can be mitigated with a simple coordinate shift:
def polygon_area(x, y):
    # coordinate shift
    x_ = x - x.mean()
    y_ = y - y.mean()
    # everything else is the same as maxb's code
    correction = x_[-1] * y_[0] - y_[-1] * x_[0]
    main_area = np.dot(x_[:-1], y_[1:]) - np.dot(y_[:-1], x_[1:])
    return 0.5 * np.abs(main_area + correction)
For example, a common geographic reference system is UTM, which might have (x,y) coordinates of (488685.984, 7133035.984). The product of those two values is 3485814708748.448. You can see that this single product is already at the edge of precision (it has the same number of decimal places as the inputs). Adding just a few of these products, let alone thousands, will result in loss of precision.
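A quick way to check how little precision is left at that magnitude is np.spacing(), which returns the gap to the next representable float:
import numpy as np
print(np.spacing(3485814708748.448))  # ~0.00049, so digits below that are lost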
A simple way to mitigate this is to shift the polygon from large positive coordinates to something closer to (0,0), for example by subtracting the centroid as in the code above. This helps in two ways:
- It eliminates a factor of x.mean() * y.mean() from each product.
- It produces a mix of positive and negative values within each dot product, which largely cancel.
The coordinate shift does not alter the total area; it just makes the calculation more numerically stable.
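To see the effect, one can run both variants above on a polygon placed at UTM-scale coordinates (the 100x100 square below is a made-up example):
# hypothetical 100x100 square at UTM-scale coordinates; the exact area is 10000
x0, y0 = 488685.984, 7133035.984
ux = np.array([x0, x0 + 100.0, x0 + 100.0, x0])
uy = np.array([y0, y0, y0 + 100.0, y0 + 100.0])
print(polygon_area(ux, uy))  # 10000.0 with the shift; without it, the last digits can drift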
The Shoelace formula can also be implemented in NumPy. Assuming these vertices:
import numpy as np
x = np.arange(0, 1, 0.001)
y = np.sqrt(1 - x**2)  # vertices along the unit circle's first-quadrant arc
We can redefine the function in NumPy to find the area:
def PolyArea(x, y):
    return 0.5 * np.abs(np.dot(x, np.roll(y, 1)) - np.dot(y, np.roll(x, 1)))
And getting the result:
print(PolyArea(x, y))
# 0.26353377782163534
Avoiding the for loop makes this function ~50x faster than PolygonArea:
%timeit PolyArea(x,y)
# 10000 loops, best of 3: 42 µs per loop
%timeit PolygonArea(zip(x,y))
# 100 loops, best of 3: 2.09 ms per loop.
Timing was done in a Jupyter notebook.