php postgresql geolocation gis postgis

Cluster points in PostGIS


I'm building an application that pulls lat/long values from a database and plots them on a Google Map. There could be thousands of data points, so I "cluster" points that are close to each other so that the user is not overwhelmed with icons. At the moment I perform this clustering in the application, with a simple algorithm like this:

  1. Get array of all points
  2. Pop first point off array
  3. Compare first point to all other points in array looking for ones that fall within x distance
  4. Create a cluster with the original and close points.
  5. Remove close points from array
  6. Repeat
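
For illustration, the steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the actual application code: it assumes points are (lat, lon) tuples in degrees and that "x distance" is a great-circle distance in kilometres, and the names haversine_km and greedy_cluster are made up for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a sketch


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def greedy_cluster(points, max_km):
    """Cluster points by repeatedly seeding a cluster with the first point left."""
    remaining = list(points)          # step 1: array of all points
    clusters = []
    while remaining:                  # step 6: repeat until nothing is left
        seed = remaining.pop(0)       # step 2: pop the first point
        cluster, rest = [seed], []
        for p in remaining:           # step 3: compare the seed to every other point
            if haversine_km(seed, p) <= max_km:
                cluster.append(p)     # step 4: close points join the seed's cluster
            else:
                rest.append(p)
        clusters.append(cluster)
        remaining = rest              # step 5: close points are removed
    return clusters
```

Each pass scans every remaining point, so the worst case (every point ends up in its own cluster) is quadratic in the number of points.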

Now I realize this is inefficient, which is why I have been looking into GIS systems. I have set up PostGIS and have my lats and longs stored in a POINT geometry object.

Can someone get me started or point me to some resources on a simple implementation of this clustering algorithm in PostGIS?


Solution

  • I ended up using a combination of ST_SnapToGrid and avg. I realize there are algorithms out there (e.g. k-means, as Denis suggested) that will give me better clusters, but for what I'm doing this is fast and accurate enough.
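
As a rough sketch of that idea (not the exact query used here): snap every point to a grid, group the points that land on the same grid node, and average their coordinates to place one marker per group. The Python below emulates this on (lat, lon) tuples so it can be run without a database. The SQL string shows what an equivalent PostGIS query might look like; the table name points, the column name geom, and the 0.5 grid size are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical PostGIS equivalent. Table `points` and column `geom` are assumed
# names; the grid size is in the units of the geometry's coordinate system
# (degrees for plain lat/long data).
SNAP_AND_AVG_SQL = """
SELECT count(*)        AS num_points,
       avg(ST_Y(geom)) AS lat,
       avg(ST_X(geom)) AS lon
FROM points
GROUP BY ST_SnapToGrid(geom, 0.5);
"""


def grid_clusters(points, size):
    """Group (lat, lon) points by nearest grid node; return (lat, lon, count) per group."""
    cells = defaultdict(list)
    for lat, lon in points:
        # Integer cell indices of the nearest grid node serve as the group key.
        key = (round(lat / size), round(lon / size))
        cells[key].append((lat, lon))

    result = []
    for members in cells.values():
        n = len(members)
        avg_lat = sum(p[0] for p in members) / n
        avg_lon = sum(p[1] for p in members) / n
        result.append((avg_lat, avg_lon, n))
    return result
```

A larger grid size gives fewer, coarser clusters, so it would typically be chosen per map zoom level. Two points that sit close together but on opposite sides of a cell boundary end up in different clusters, which is the accuracy trade-off mentioned above.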