I have the following issue: I have millions of entries (points in a map) and I want to map them to the nearest road (OSM data). I have implemented a brute force approach in java, where I query an OSM database to locate the nearest road and then I assign the osm object (road-id) to the corresponding entry.
However this approach is inefficient since the query for each entry takes almost 1 second which makes the whole process endless.
PreparedStatement psWithLastPosition = this.conn.prepareStatement(sql);
int i = 0;
for (FCDEntry entry : dataset.getEntries()) {
i += 1;
String sqlQueryRoad = "SELECT osm_id,highway FROM planet_osm_roads ORDER BY "
+ "ST_DISTANCE('SRID=4326;"
+ ST_SetSRID.setSRID(new ST_MakePoint().createPoint(entry.getLongitude(),entry.getLatitude()), 4326).toString()
+ "'::geometry, "
+ "ST_Transform(way::geometry,4326)) ASC LIMIT 1;";
PreparedStatement psID = this.conn.prepareStatement(sqlQueryRoad);
ResultSet rs = psID.executeQuery();
String osm_id ="";
while (rs.next()) {
osm_id = rs.getString("osm_id");
}
log.info(osm_id);
PreparedStatement ps = psWithLastPosition;
ps.setString(1, entry.getAssetId());
...
ps.setInt(18, Integer.valueOf(osm_id));
ps.addBatch();
// Execute every 1000 items
if (i % 1000 == 0 || i == dataset.getEntries().size()) {
log.info(i + "/" + dataset.getEntries().size());
psWithLastPosition.executeBatch();
psWithoutLastPosition.executeBatch();
}
Any idea on how to speed up the mapping process ?
The solution is based on the comments (thanks to @JGH).
I had to change the st_distance command with the <-> operator. The query now is almost x1000 faster.
i.e., my query now is:
String sqlQueryRoad = "SELECT osm_id,highway FROM planet_osm_roads ORDER BY "
+ "way <-> ST_Transform(ST_GeomFromText('POINT ("+ entry.getLongitude() + " "
+ entry.getLatitude()+ ")',4326), 3857) ASC LIMIT 1;";
For some reason instead of geom column I had way column associated with geometry (almost all posts I found had geom column, best guess they changed osm2pgsql insertion function for OSM data).