Search code examples
javapostgresqlpostgisopenstreetmap

POSTGIS: querying for nearest point efficiently


I have the following issue: I have millions of entries (points in a map) and I want to map them to the nearest road (OSM data). I have implemented a brute force approach in java, where I query an OSM database to locate the nearest road and then I assign the osm object (road-id) to the corresponding entry.

However this approach is inefficient since the query for each entry takes almost 1 second which makes the whole process endless.

        PreparedStatement psWithLastPosition = this.conn.prepareStatement(sql);

        int i = 0;
        for (FCDEntry entry : dataset.getEntries()) {
            i += 1;
            String sqlQueryRoad = "SELECT osm_id,highway FROM planet_osm_roads ORDER BY " 
            + "ST_DISTANCE('SRID=4326;" 
            + ST_SetSRID.setSRID(new ST_MakePoint().createPoint(entry.getLongitude(),entry.getLatitude()), 4326).toString()
            + "'::geometry, " 
            + "ST_Transform(way::geometry,4326)) ASC LIMIT 1;";

            PreparedStatement psID = this.conn.prepareStatement(sqlQueryRoad);
            ResultSet rs = psID.executeQuery();
            String osm_id ="";
            while (rs.next()) {
                osm_id = rs.getString("osm_id");
            }

            log.info(osm_id);

            PreparedStatement ps = psWithLastPosition;
            ps.setString(1, entry.getAssetId());
            ...
            ps.setInt(18, Integer.valueOf(osm_id));
            ps.addBatch();

            // Execute every 1000 items
            if (i % 1000 == 0 || i == dataset.getEntries().size()) {
                log.info(i + "/" + dataset.getEntries().size());
                psWithLastPosition.executeBatch();
                psWithoutLastPosition.executeBatch();
            }

Any idea on how to speed up the mapping process ?


Solution

  • The solution is based on the comments (thanks to @JGH).

    I had to change the st_distance command with the <-> operator. The query now is almost x1000 faster.

    i.e., my query now is:

    String sqlQueryRoad = "SELECT osm_id,highway FROM planet_osm_roads ORDER BY " 
                        + "way <-> ST_Transform(ST_GeomFromText('POINT ("+ entry.getLongitude() + " " 
                        + entry.getLatitude()+ ")',4326), 3857) ASC LIMIT 1;";
    

    For some reason instead of geom column I had way column associated with geometry (almost all posts I found had geom column, best guess they changed osm2pgsql insertion function for OSM data).