Search code examples
postgis

How to query postgis data by closest point and only return results for that point?


I have a postgis table of points, 460 million records. It has a timestamp & point column.

I'm building graphs based on this data, a list of values for each timestamp that belong to the closest point, leaflet sends the lat/long from the map (where the user clicked) to the script that generates the chart-ready data.

SELECT thevalue 
FROM thetable 
WHERE ST_DWithin (thepoint, ST_MakePoint($get_lon, $get_lat), 0.04) 
ORDER BY thedate 
LIMIT 1000

This works great (for some clicks) but there has to be a better/faster way, I'd like the query to know what point to listen to and only return values for that point. Is there a better function for this requirement?


Solution

  • What king of geometry do you have? what projection are you using? I'm going to assume that your points are in wgs84 (epsg:4326)

    If you want distances to be accurate, it's better to use geography in calculations:

    alter points_table add column geog geography
    update points_table set geog = geom::geography
    

    create an index, and run cluster and analyze to speed up queries

    create index my_index_geog on points_table using gist(geog) /* change geog for geom if using geometry */
    cluster points_table using my_index_geog
    analyze points_table
    

    to get the closest point:

    SELECT point_id 
    FROM points_table
    ORDER BY geog <-> ST_SetSrid(ST_MakePoint($get_lon, $get_lat),4326)::geography limit 1;
    

    all together to get the values:

    select value
    from table
    where point_id = (SELECT point_id 
    FROM points_table
    ORDER BY geog <-> ST_SetSrid(ST_MakePoint($get_lon, $get_lat),4326)::geography limit 1)
    order by thedate
    limit 1000;
    

    additionally I would suggest keeping a table that contains only the points id's and the geometry/geography so the closest-point query runs faster. If you create such table, called only_points, the query becomes:

    select value
    from table
    where point_id = (SELECT point_id 
    FROM only_points
    ORDER BY geog <-> ST_SetSrid(ST_MakePoint($get_lon, $get_lat),4326)::geography limit 1)
    order by thedate
    limit 1000;
    

    If you need to keep using geometry, then you'll need to create the index on the geometry, cluster based on geom and run the query:

    select value
    from table
    where point_id = (SELECT point_id 
    FROM points_table
    ORDER BY geom::geography <-> ST_SetSrid(ST_MakePoint($get_lon, $get_lat),4326)::geography limit 1)
    order by thedate
    limit 1000;
    

    It will be slower, however, because you'll be converting to geography on each step

    see KNN in Postgis and PostGIS geography type and indexes