java, geotools, map-matching

Java GeoTools: Snap to line, identifying the line that was snapped to


I am trying to write a Java program that will snap a large series of GPS coordinates to a line shapefile (a road network) and return not just the new coordinates, but also a unique identifier for the line segment snapped to. It doesn't matter whether this identifier is a FID, an "index" as used in other languages (i.e. where 1 is the first feature, etc.) or any column in the attribute table.

I have done this in R using the maptools::snapPointsToLines function, but that does not scale to the volumes of data I need to process, so I'm looking to Java to process the data more quickly for later analysis in R.

My code (below) is currently very similar to the GeoTools tutorial for snapping, with the minor differences that I read in a (19 million line) CSV of GPS points instead of generating them, and I write a CSV of results. It snaps fine, and much more quickly than what I was getting in R, but I have no idea how to identify the line snapped to. The available documentation seems to cover queries and filtering on feature sets, which I cannot apply to the indexed line objects this code creates. The existing toString() call in my code returns something unintelligible for my purposes, for instance com.vividsolutions.jts.linearref.LocationIndexedLine@74cec793.

Basically, I just want the lineID field to produce something that any other GIS software or language can match to a specific road segment.

package org.geotools.tutorial.quickstart;

import java.io.*;
import java.util.List;
import java.util.Arrays;

import com.vividsolutions.jts.geom.Coordinate;
import com.vividsolutions.jts.geom.Envelope;
import com.vividsolutions.jts.geom.Geometry;
import com.vividsolutions.jts.geom.LineString;
import com.vividsolutions.jts.geom.MultiLineString;
import com.vividsolutions.jts.index.SpatialIndex;
import com.vividsolutions.jts.index.strtree.STRtree;
import com.vividsolutions.jts.linearref.LinearLocation;
import com.vividsolutions.jts.linearref.LocationIndexedLine;

import org.geotools.data.FeatureSource;
import org.geotools.data.FileDataStore;
import org.geotools.data.FileDataStoreFinder;
import org.geotools.feature.FeatureCollection;
import org.geotools.geometry.jts.ReferencedEnvelope;
import org.geotools.swing.data.JFileDataStoreChooser;
import org.geotools.util.NullProgressListener;
import org.opengis.feature.Feature;
import org.opengis.feature.FeatureVisitor;
import org.opengis.feature.simple.SimpleFeature;
import com.opencsv.*;

public class SnapToLine {

    public static void main(String[] args) throws Exception {

        /*
         * Open a shapefile. You should choose one with line features
         * (LineString or MultiLineString geometry)
         * 
         */
        File file = JFileDataStoreChooser.showOpenFile("shp", null);
        if (file == null) {
            return;
        }

        FileDataStore store = FileDataStoreFinder.getDataStore(file);
        FeatureSource source = store.getFeatureSource();

        // Check that we have line features
        Class<?> geomBinding = source.getSchema().getGeometryDescriptor().getType().getBinding();
        boolean isLine = geomBinding != null 
                && (LineString.class.isAssignableFrom(geomBinding) ||
                    MultiLineString.class.isAssignableFrom(geomBinding));

        if (!isLine) {
            System.out.println("This example needs a shapefile with line features");
            return;
        }
        final SpatialIndex index = new STRtree();
        FeatureCollection features = source.getFeatures();
        //FeatureCollection featurecollection = source.getFeatures(Query.FIDS);
        System.out.println("Slurping in features ...");
        features.accepts(new FeatureVisitor() {

            @Override
            public void visit(Feature feature) {
                SimpleFeature simpleFeature = (SimpleFeature) feature;
                // Cast to Geometry so that plain LineString features do not throw a ClassCastException
                Geometry geom = (Geometry) simpleFeature.getDefaultGeometry();
                // Just in case: check for  null or empty geometry
                if (geom != null) {
                    Envelope env = geom.getEnvelopeInternal();
                    if (!env.isNull()) {
                        index.insert(env, new LocationIndexedLine(geom));
                    }
                }
            }
        }, new NullProgressListener());

        /*
         * We define the maximum distance that a line can be from a point
         * to be a candidate for snapping
         */

        ReferencedEnvelope bounds = features.getBounds();
        final double MAX_SEARCH_DISTANCE = bounds.getSpan(0) / 1000.0;



        int pointsProcessed = 0;
        int pointsSnapped = 0;
        long elapsedTime = 0;
        long startTime = System.currentTimeMillis();
        double longiOut;
        double latiOut;
        int moved;
        String lineID   = "NA";

        //Open up the CSVReader. Reading in line by line to avoid memory failure.

        CSVReader csvReader = new CSVReader(new FileReader(new File("fakedata.csv")));
        String[] rowIn;



        //open up the CSVwriter
        String outcsv = "fakedataOUT.csv";
        CSVWriter writer = new CSVWriter(new FileWriter(outcsv));



        while ((rowIn = csvReader.readNext()) != null) {

            // Get point and create search envelope
            pointsProcessed++;
            double longi = Double.parseDouble(rowIn[0]);
            double lati  = Double.parseDouble(rowIn[1]);
            Coordinate pt = new Coordinate(longi, lati);
            Envelope search = new Envelope(pt);
            search.expandBy(MAX_SEARCH_DISTANCE);

            /*
             * Query the spatial index for objects within the search envelope.
             * Note that this just compares the point envelope to the line envelopes
             * so it is possible that the point is actually more distant than
             * MAX_SEARCH_DISTANCE from a line.
             */
            List<LocationIndexedLine> lines = index.query(search);

            // Initialize the minimum distance found to our maximum acceptable
            // distance plus a little bit
            double minDist = MAX_SEARCH_DISTANCE + 1.0e-6;
            Coordinate minDistPoint = null;

            for (LocationIndexedLine line : lines) {
                LinearLocation here = line.project(pt);
                Coordinate point = line.extractPoint(here);
                double dist = point.distance(pt);
                if (dist < minDist) {
                    minDist = dist;
                    minDistPoint = point;
                    lineID = line.toString();
                }
            }


            if (minDistPoint == null) {
                // No line close enough to snap the point to
                System.out.println(pt + "- X");
                longiOut = longi;
                latiOut  = lati;
                moved    = 0;
                lineID   = "NA";
            } else {
                System.out.printf("%s - snapped by moving %.4f\n", 
                        pt.toString(), minDist);
                longiOut = minDistPoint.x;
                latiOut  = minDistPoint.y;
                moved    = 1;        
                pointsSnapped++;
            }
            // Write a new row
            String[] rowOut = {Double.toString(longiOut), Double.toString(latiOut), Integer.toString(moved), lineID};
            writer.writeNext(rowOut);
        }

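        // elapsedTime was never updated, so compute it before reporting the rate
        elapsedTime = System.currentTimeMillis() - startTime;
        System.out.printf("Processed %d points (%.2f points per second). \n"
                + "Snapped %d points.\n\n",
                pointsProcessed,
                1000.0 * pointsProcessed / elapsedTime,
                pointsSnapped);
        csvReader.close();
        writer.close();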
    }
}

I am not only new to Java but also only self-taught in domain-specific languages like R; I am not a coder so much as someone who uses code, so if the solution seems obvious I may be lacking in elementary theory!

P.S. I am aware that there are better map-matching solutions out there (GraphHopper etc.); I'm just trying to start out easy!

Thank you!


Solution

  • I would try to avoid going so far down the JTS rabbit hole and stick with GeoTools (of course I am a GeoTools dev so I would say that).

    First I'd use a SpatialIndexFeatureCollection to hold my lines (assuming they fit in memory; otherwise a PostGIS table is the way to go). This saves me having to build my own index.

    Then I'd use a CSVDataStore to save parsing my own points out of the GPS stream (because I'm lazy and there's plenty to go wrong there too).

    This means that the bulk of the work boils down to this loop, where DWITHIN finds all features within the specified distance:

    // 'writer' is assumed to be a JTS WKTWriter and 'indexed' the SpatialIndexFeatureCollection of lines
    try (SimpleFeatureIterator itr = pointFeatures.getFeatures().features()) {
      while (itr.hasNext()) {
        SimpleFeature f = itr.next();
        Geometry snapee = (Geometry) f.getDefaultGeometry();
        // DWITHIN selects every line within MAX_SEARCH_DISTANCE of the point
        Filter filter = ECQL.toFilter("DWITHIN(\"the_geom\",'" + writer.write(snapee) + "'," + MAX_SEARCH_DISTANCE + ")");
        SimpleFeatureCollection possibles = indexed.subCollection(filter);
        double minDist = Double.POSITIVE_INFINITY;
        SimpleFeature bestFit = null;
        Coordinate bestPoint = null;
        try (SimpleFeatureIterator pItr = possibles.features()) {
          while (pItr.hasNext()) {
            SimpleFeature p = pItr.next();
            Geometry line = (Geometry) p.getDefaultGeometry();

            double dist = snapee.distance(line);
            if (dist < minDist) {
              minDist = dist;
              // nearestPoints returns one point per input geometry; [1] lies on the line
              bestPoint = DistanceOp.nearestPoints(snapee, line)[1];
              bestFit = p;
            }
          }
        }
        // bestFit, bestPoint and minDist now describe the snap for this point
        // (bestFit stays null if no line was within range)
      }
    }
    

    At the end of each pass through that loop you should know the nearest feature (bestFit) from the lines (including its id, name, etc.), the closest point (bestPoint) and the distance moved (minDist).

    Again, I'd probably use a CSVDataStore to write the features back out too.

    If you have millions of points, I'd probably look at using a FilterFactory to create the filter directly instead of using the ECQL parser, which avoids parsing a new filter string for every point.
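
The key idea in the answer is that the identifier has to travel with the geometry: the loop keeps the whole feature (bestFit), not just the line, so its ID is still available after the nearest line is found. The following is a minimal, library-free sketch of that idea, not the GeoTools code itself. The names SnapSketch, Road and Snap are hypothetical, and it uses a plain linear scan where the real code would use a spatial index. In the question's STRtree version, the same effect could presumably be had by inserting a small holder object (feature ID plus LocationIndexedLine) into the index instead of the bare LocationIndexedLine.

```java
import java.util.List;

/** Library-free sketch: keep each line's ID next to its geometry while snapping. */
class SnapSketch {

    /** Hypothetical stand-in for a road feature: an ID plus a polyline given as vertex arrays. */
    static final class Road {
        final String id;
        final double[] xs;
        final double[] ys;

        Road(String id, double[] xs, double[] ys) {
            this.id = id;
            this.xs = xs;
            this.ys = ys;
        }
    }

    /** Result of a snap: the road's ID, the snapped position and the distance moved. */
    static final class Snap {
        final String roadId;
        final double x;
        final double y;
        final double dist;

        Snap(String roadId, double x, double y, double dist) {
            this.roadId = roadId;
            this.x = x;
            this.y = y;
            this.dist = dist;
        }
    }

    /** Returns the nearest snap within maxDist, or null if no road is close enough. */
    static Snap snap(List<Road> roads, double px, double py, double maxDist) {
        Snap best = null;
        double bestDist = maxDist;
        for (Road r : roads) {
            for (int i = 0; i + 1 < r.xs.length; i++) {
                double ax = r.xs[i];
                double ay = r.ys[i];
                double dx = r.xs[i + 1] - ax;
                double dy = r.ys[i + 1] - ay;
                double len2 = dx * dx + dy * dy;
                // Project the point onto the segment and clamp to its end points
                double t = len2 == 0 ? 0 : ((px - ax) * dx + (py - ay) * dy) / len2;
                t = Math.max(0, Math.min(1, t));
                double sx = ax + t * dx;
                double sy = ay + t * dy;
                double d = Math.hypot(px - sx, py - sy);
                if (d <= bestDist) {
                    bestDist = d;
                    // The ID is recorded together with the snapped point
                    best = new Snap(r.id, sx, sy, d);
                }
            }
        }
        return best;
    }
}
```

Calling SnapSketch.snap(roads, x, y, maxDist) for each GPS row then gives the values for the output CSV: the snapped coordinates, and roadId in place of the unintelligible toString() value. A null result corresponds to the "NA" case in the question's code.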