Calculating average width of polygon?
Part of the problem is to find a suitable definition of "average width." Several are natural but will differ, at least slightly. For simplicity, consider definitions based on properties that are easy to compute (which is going to rule out those based on the medial axis transform or sequences of buffers, for instance).
As an example, consider that the archetypal intuition of a polygon with a definite "width" is a small buffer (say of radius r with squared ends) around a long, fairly straight polyline (say of length L). We think of 2r = w as its width. Thus:
Its perimeter P is approximately equal to 2L + 2w;
Its area A is approximately equal to w L.
The width w and length L can then be recovered as roots of the quadratic x^2 - (P/2)x + A; in particular, we can estimate
- w = (P - Sqrt(P^2 - 16A))/4.
When you're sure the polygon really is long and skinny, as a further approximation you can take 2L + 2w to equal 2L, whence
- w (crudely) = 2A / P.
The relative error in this approximation is proportional to w/L: the skinnier the polygon, the closer w/L is to zero, and the better the approximation gets.
Not only is this approach extremely simple (just divide the area by the perimeter and multiply by 2), with either formula it doesn't matter how the polygon is oriented or where it is situated (because such Euclidean motions change neither the area nor the perimeter).
You might consider using either of these formulas to estimate average width for any polygons that represent street segments. The error you make in the original estimate of w (with the quadratic formula) comes about because the area A also includes tiny wedges at each bend of the original polyline. If the sum of bend angles is t radians (this is the total absolute curvature of the polyline), then really
P = 2L + 2w + 2 Pi t w and
A = L w + Pi t w^2.
Plug these into the previous (quadratic formula) solution and simplify. When the smoke clears, the contribution from the curvature term t has disappeared! What originally looked like an approximation is perfectly accurate for non-self-intersecting polyline buffers (with squared ends). For variable-width polygons, this is therefore a reasonable definition of average width.
Here I show little optimization about @whuber solution, and I'm putting in terms of "buffer width", because it is useful for integrate the solution of a more general problem: Is there a st_buffer inverse function, that returns a width estimation?
CREATE FUNCTION buffer_width(
-- rectangular strip mean width estimator
p_len float, -- len of the central line of g
p_geom geometry, -- g
p_btype varchar DEFAULT 'endcap=flat' -- st_buffer() parameter
) RETURNS float AS $f$
DECLARE
w_half float;
w float;
BEGIN
w_half := 0.25*ST_Area(p_geom)/p_len;
w := 0.50*ST_Area( ST_Buffer(p_geom,-w_half,p_btype) )/(p_len-2.0*w_half);
RETURN w_half+w;
END
$f$ LANGUAGE plpgsql IMMUTABLE;
For this problem, the @celenius question about street width, sw
, the solution is
sw = buffer_width(ST_Length(g1), g2)
where sw
is the "average width", g1
the central line of g2
, and the street g2
is a POLYGON. I used only the OGC standard library, tested with PostGIS, and solved other serious practical applications with the same buffer_width function.
DEMONSTRATION
A2
is the area of g2
, L1
the length of the central line (g1
) of g2
.
Supposing that we can generate g2
by g2=ST_Buffer(g1,w)
, and that g1
is a straight, so g2
is a rectangle with lenght L1
and width 2*w
, and
A2 = L1*(2*w) --> w = 0.5*A2/L1
It is not the same formula of @whuber, because here w
is a half of the rectangle (g2
) width. It is a good estimator, but as we can see by tests (below), is not exact, and the function use it as a clue, to reduce the g2
area, and as a final estimator.
Here we do not evaluate buffers with "endcap=square" or "endcap=round", that need a sum to A2
of an area of a point buffer with the same w
.
REFERENCES: in a similar forum of 2005, W. Huber explains suchlike and other solutions.
TESTS AND REASONS
For straight lines the results, as expected, are exact. But for other geometries the results can be disappointing. The main reason is, perhaps, all the model is for exact rectangles, or for geometries that can be approximated to a "strip rectangle". Here a "test kit" for check the limits of this approximation (see wfactor
in the results above).
SELECT *, round(100.0*(w_estim-w)/w,1) as estim_perc_error
FROM (
SELECT btype, round(len,1) AS len, w, round(w/len,3) AS wfactor,
round( buffer_width(len, gbase, btype) ,2) as w_estim ,
round( 0.5*ST_Area(gbase)/len ,2) as w_near
FROM (
SELECT
*, st_length(g) AS len, ST_Buffer(g, w, btype) AS gbase
FROM (
-- SELECT ST_GeomFromText('LINESTRING(50 50,150 150)') AS g, -- straight
SELECT ST_GeomFromText('LINESTRING(50 50,150 150,150 50,250 250)') AS g,
unnest(array[1.0,10.0,20.0,50.0]) AS w
) AS t,
(SELECT unnest(array['endcap=flat','endcap=flat join=bevel']) AS btype
) AS t2
) as t3
) as t4;
RESULTS:
WITH RECTANGLES (central line is a STRAIGHT LINE):
btype | len | w | wfactor | w_estim | w_near | estim_perc_error
------------------------+-------+------+---------+---------+--------+------------------
endcap=flat | 141.4 | 1.0 | 0.007 | 1 | 1 | 0
endcap=flat join=bevel | 141.4 | 1.0 | 0.007 | 1 | 1 | 0
endcap=flat | 141.4 | 10.0 | 0.071 | 10 | 10 | 0
endcap=flat join=bevel | 141.4 | 10.0 | 0.071 | 10 | 10 | 0
endcap=flat | 141.4 | 20.0 | 0.141 | 20 | 20 | 0
endcap=flat join=bevel | 141.4 | 20.0 | 0.141 | 20 | 20 | 0
endcap=flat | 141.4 | 50.0 | 0.354 | 50 | 50 | 0
endcap=flat join=bevel | 141.4 | 50.0 | 0.354 | 50 | 50 | 0
WITH OTHER GEOMETRIES (centerline folded):
btype | len | w | wfactor | w_estim | w_near | estim_perc_error
-----------------------+-----+------+---------+---------+--------+------------------
endcap=flat | 465 | 1.0 | 0.002 | 1 | 1 | 0
endcap=flat join=bevel | 465 | 1.0 | 0.002 | 1 | 0.99 | 0
endcap=flat | 465 | 10.0 | 0.022 | 9.98 | 9.55 | -0.2
endcap=flat join=bevel | 465 | 10.0 | 0.022 | 9.88 | 9.35 | -1.2
endcap=flat | 465 | 20.0 | 0.043 | 19.83 | 18.22 | -0.9
endcap=flat join=bevel | 465 | 20.0 | 0.043 | 19.33 | 17.39 | -3.4
endcap=flat | 465 | 50.0 | 0.108 | 46.29 | 40.47 | -7.4
endcap=flat join=bevel | 465 | 50.0 | 0.108 | 41.76 | 36.65 | -16.5
wfactor= w/len
w_near = 0.5*area/len
w_estim is the proposed estimator, the buffer_width function.
About btype
see ST_Buffer guide, with good ilustratins and the LINESTRINGs used here.
CONCLUSIONS:
- the estimator of
w_estim
is always better thanw_near
; - for "near to rectangular"
g2
geometries, its ok, anywfactor
- for another geometries (near to "rectangular strips"), use the limit
wfactor=~0.01
for 1% of error onw_estim
. Up to this wfactor, use another estimator.
Caution and prevention
Why the estimation error occurs?
When you use ST_Buffer(g,w)
, you expect, by the "rectangular strip model", that the new area added by the buffer of width w
is about w*ST_Length(g)
or w*ST_Perimeter(g)
... When not, usually by overlays (see folded lines) or by "styling", is when the estimation of average w
fault. This is the main message of the tests.
To detect this problem at any king of buffer, check the behavior of the buffer generation:
SELECT btype, w, round(100.0*(a1-len1*2.0*w)/a1)::varchar||'%' AS straight_error,
round(100.0*(a2-len2*2.0*w)/a2)::varchar||'%' AS curve2_error,
round(100.0*(a3-len3*2.0*w)/a3)::varchar||'%' AS curve3_error
FROM (
SELECT
*, st_length(g1) AS len1, ST_Area(ST_Buffer(g1, w, btype)) AS a1,
st_length(g2) AS len2, ST_Area(ST_Buffer(g2, w, btype)) AS a2,
st_length(g3) AS len3, ST_Area(ST_Buffer(g3, w, btype)) AS a3
FROM (
SELECT ST_GeomFromText('LINESTRING(50 50,150 150)') AS g1, -- straight
ST_GeomFromText('LINESTRING(50 50,150 150,150 50)') AS g2,
ST_GeomFromText('LINESTRING(50 50,150 150,150 50,250 250)') AS g3,
unnest(array[1.0,20.0,50.0]) AS w
) AS t,
(SELECT unnest(array['endcap=flat','endcap=flat join=bevel']) AS btype
) AS t2
) as t3;
RESULTS:
btype | w | straight_error | curve2_error | curve3_error
------------------------+------+----------------+--------------+--------------
endcap=flat | 1.0 | 0% | -0% | -0%
endcap=flat join=bevel | 1.0 | 0% | -0% | -1%
endcap=flat | 20.0 | 0% | -5% | -10%
endcap=flat join=bevel | 20.0 | 0% | -9% | -15%
endcap=flat | 50.0 | 0% | -14% | -24%
endcap=flat join=bevel | 50.0 | 0% | -26% | -36%
If you can join your polygon data to your centerline data (by either spatial or tabular means), then just sum the polygon areas for each centerline alignment and divide by the length of the centerline.