Drawing line between points at specific distance in PostGIS?
As @FelixIP points out, the first step is to find the points that will make up each line. You can do this by calling ST_ClusterWithin with your maximum separation distance:
SELECT
row_number() OVER () AS cid,
(ST_Dump(geom)).geom
FROM (
SELECT unnest(st_clusterwithin(geom, 0.05)) AS geom
FROM inputs) sq
Then, you'll need to use some heuristic to build a line through all of the points in each cluster. For example, if you can assume the desired lines to be Y-monotone, you can sort the points in each cluster and feed them into ST_MakeLine. Combining that all together would look like this:
SELECT
ST_MakeLine(geom ORDER BY ST_Y(geom)) AS geom
FROM (
SELECT row_number() OVER () AS cid,
(ST_Dump(geom)).geom FROM (
SELECT unnest(st_clusterwithin(geom, 0.05)) AS geom
FROM inputs) sq) ssq
GROUP BY cid
You can use a recursive query to explore nearest neighbor of each point starting from each detected end of lines you want to build.
Prerequisites : prepare a postgis layer with your points and another with a single Multi-linestring object containing your roads. The two layers must be on the same CRS. Here is the code for the test data-set I created, please modify it as needed. (Tested on postgres 9.2 and postgis 2.1)
WITH RECURSIVE
points as (SELECT id, st_transform((st_dump(wkb_geometry)).geom,2154) as geom, my_comment as com FROM mypoints),
roads as (SELECT st_transform(ST_union(wkb_geometry),2154) as geom from highway),
Here are the steps:
Generate for each point the list of every neighbors and theirs distance that meet theses three criteria.
- Distance must not exceed a user defined threshold (this will avoid linking to isolated point)
graph_full as ( SELECT a.id, b.id as link_id, a.com, st_makeline(a.geom,b.geom) as geom, st_distance(a.geom,b.geom) as distance FROM points a LEFT JOIN points b ON a.id<>b.id WHERE st_distance(a.geom,b.geom) <= 15 ),
- Direct path must not cross a road
graph as ( SELECt graph_full.* FROM graph_full RIGHT JOIN roads ON st_intersects(graph_full.geom,roads.geom) = false ),
Distance must not exceed a user defined ratio of the distance from the nearest neighbor (this should accommodate better to irregular digitalization than fixed distance)This part was actually too hard to implement, sticked to fixed search radius
Let's call this table "the graph"
- Distance must not exceed a user defined threshold (this will avoid linking to isolated point)
Select end of line point by joining to the graph and keeping only point that have exactly one entry in the graph.
eol as ( SELECT points.* FROM points JOIN (SELECT id, count(*) FROM graph GROUP BY id HAVING count(*)= 1) sel ON points.id = sel.id),
Let's call this table "eol" (end of line)
easy? that the reward for doing a great graph but hold-on things will go crazy at next stepSet up a recursive query that will cycle from neighbors to neighbors starting from each eol
- Initialize the recursive query using eol table and adding a counter for the depth, an aggregator for the path, and a geometry constructor to build the lines
- Move to next iteration by switching to nearest neighbor using the graph and checking that you never go backward using the path
- After the iteration is finished keep only the longest path for each starting point (if your dataset include potential intersection between expect lines that part would need more conditions)
recurse_eol (id, link_id, depth, path, start_id, geom) AS (--initialisation SELECT id, link_id, depth, path, start_id, geom FROM ( SELECT eol.id, graph.link_id,1 as depth, ARRAY[eol.id, graph.link_id] as path, eol.id as start_id, graph.geom as geom, (row_number() OVER (PARTITION BY eol.id ORDER BY distance asc))=1 as test FROM eol JOIn graph ON eol.id = graph.id ) foo WHERE test = true UNION ALL ---here start the recursive part SELECT id, link_id, depth, path, start_id, geom FROM ( SELECT graph.id, graph.link_id, r.depth+1 as depth, path || graph.link_id as path, r.start_id, ST_union(r.geom,graph.geom) as geom, (row_number() OVER (PARTITION BY r.id ORDER BY distance asc))=1 as test FROM recurse_eol r JOIN graph ON r.link_id = graph.id AND NOT graph.link_id = ANY(path)) foo WHERE test = true AND depth < 1000), --this last line is a safe guard to stop recurring after 1000 run adapt it as needed
Let's call this table "recurse_eol"
Keep only longest line for each start point and remove every exact duplicate path Example : paths 1,2,3,5 AND 5,3,2,1 are the same line discovered by it's two differents "end of line"
result as (SELECT start_id, path, depth, geom FROM (SELECT *, row_number() OVER (PARTITION BY array(SELECT * FROM unnest(path) ORDER BY 1))=1 as test_duplicate, (max(depth) OVER (PARTITION BY start_id))=depth as test_depth FROM recurse_eol) foo WHERE test_depth = true AND test_duplicate = true) SELECT * FROM result
Manually checks remaining errors (isolated points, overlapping lines, weirdly shaped street)
Updated as promised, I still can't figure out why sometimes recursive query don't give exact same result when starting from opposite eol of a same line so some duplicate may remain in result layer as of now.
Feel free to ask I totally get that this code need more comments. Here is the full query:
WITH RECURSIVE
points as (SELECT id, st_transform((st_dump(wkb_geometry)).geom,2154) as geom, my_comment as com FROM mypoints),
roads as (SELECT st_transform(ST_union(wkb_geometry),2154) as geom from highway),
graph_full as (
SELECT a.id, b.id as link_id, a.com, st_makeline(a.geom,b.geom) as geom, st_distance(a.geom,b.geom) as distance
FROM points a
LEFT JOIN points b ON a.id<>b.id
WHERE st_distance(a.geom,b.geom) <= 15
),
graph as (
SELECt graph_full.*
FROM graph_full RIGHT JOIN
roads ON st_intersects(graph_full.geom,roads.geom) = false
),
eol as (
SELECT points.* FROM
points JOIN
(SELECT id, count(*) FROM graph
GROUP BY id
HAVING count(*)= 1) sel
ON points.id = sel.id),
recurse_eol (id, link_id, depth, path, start_id, geom) AS (
SELECT id, link_id, depth, path, start_id, geom FROM (
SELECT eol.id, graph.link_id,1 as depth,
ARRAY[eol.id, graph.link_id] as path,
eol.id as start_id,
graph.geom as geom,
(row_number() OVER (PARTITION BY eol.id ORDER BY distance asc))=1 as test
FROM eol JOIn graph ON eol.id = graph.id
) foo
WHERE test = true
UNION ALL
SELECT id, link_id, depth, path, start_id, geom FROM (
SELECT graph.id, graph.link_id, r.depth+1 as depth,
path || graph.link_id as path,
r.start_id,
ST_union(r.geom,graph.geom) as geom,
(row_number() OVER (PARTITION BY r.id ORDER BY distance asc))=1 as test
FROM recurse_eol r JOIN graph ON r.link_id = graph.id AND NOT graph.link_id = ANY(path)) foo
WHERE test = true AND depth < 1000),
result as (SELECT start_id, path, depth, geom FROM
(SELECT *,
row_number() OVER (PARTITION BY array(SELECT * FROM unnest(path) ORDER BY 1))=1 as test_duplicate,
(max(depth) OVER (PARTITION BY start_id))=depth as test_depth
FROM recurse_eol) foo
WHERE test_depth = true AND test_duplicate = true)
SELECT * FROM result