TransWikia.com

Clustering points in PostGIS

Geographic Information Systems Asked by Karthik Katragadda on December 26, 2020

I am trying to cluster points in PostGIS. I have the points and the corresponding Latitudes and Longitudes which I have converted into point geometries. I want to cluster points in such a way that all the points within a cluster are within 25 miles of each other.

My input looks like this:

enter image description here

My desired output is this:

enter image description here

Here the new column Cluster_id denotes the cluster to which the point is assigned to. Please tell me the right query to get this done. I am new to SQL so I am finding hard to write the right query.

So far, I have tried this:

SELECT ST_AsText(unnest(ST_ClusterWithin(the_geom, 1))) FROM all_locations

One Answer

I'd suggest to use the ST_ClusterDBSCAN Window function rather than the Aggregate function ST_ClusterWithin:

SELECT *,
       ST_ClusterDBSCAN(the_geom, eps := <distance>, minpoints := 1) OVER() AS clst_id
FROM   all_locations
;

clst_id will hold INT values representing the cluster each rows geometry belongs to.


As stated in the comments, ST_ClusterWithin will aggregate geometries that are separated by no more than the distance to each other; using minpoints := 1 in ST_ClusterDBSCAN will force the same effect.

Compare

WITH
    pts AS (
        SELECT  ST_MakePoint(n, 0) As geom
        FROM    Generate_Series(0, 5) AS n
    )

SELECT  dmp.clst_id
FROM    (
    SELECT  ST_ClusterWithin(geom, 1) AS cw
    FROM    pts
) AS    clst,
        UNNEST(clst.cw) WITH ORDINALITY AS dmp (clst, clst_id),
        LATERAL ST_Dump(ST_CollectionExtract(dmp.clst, 1)) AS extr
;


 clst_id |    geom    
---------+------------
       1 | POINT(0 0)
       1 | POINT(1 0)
       1 | POINT(2 0)
       1 | POINT(3 0)
       1 | POINT(4 0)
       1 | POINT(5 0)
(6 rows)

to

WITH
    pts AS (
        SELECT  ST_MakePoint(n, 0) As geom
        FROM    Generate_Series(0, 5) AS n
    )

SELECT ST_ClusterDBSCAN(geom, 1, 1) OVER() AS clst_id,
       ST_AsText(geom) AS geom
FROM   pts
;


 clst_id |    geom    
--------+------------
      0 | POINT(0 0)
      0 | POINT(1 0)
      0 | POINT(2 0)
      0 | POINT(3 0)
      0 | POINT(4 0)
      0 | POINT(5 0)
(6 rows)

In both cases the geometries are stretched over a total distance of 5 degrees, but count as one and the same cluster (ST_ClusterDBSCAN starts counting at 0, whereas the ORDINALITY stars at 1) since they are within distance/eps of 1 degree to each other!

This behavior may change for minpoints > 1 (and on other data than the above), as there need to be at least minpoints core geometries within eps distance to get counted as cluster.

Needless to say, the latter approach is way less convoluted, and offers some nice functionality built into the windowing behavior (e.g. easy clustering over attributes etc.)


Note:

Both functions assume distance/eps in units of the underlying CRS; for a geographic reference system, this is degrees! Since there is no signature accepting GEOGRAPHY for neither of them, you will need to ST_Transform your data into a suitable projection to be able to work with metric/imperial units.

Answered by geozelot on December 26, 2020

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP