python numpy gis vectorization interpolation

Fastest way to get all the points between two (X,Y) coordinates in python


I have a shapely LineString:

print(np.round(shapely_intersecting_lines.coords).astype(int))
>>> array([[ 1520, -1140],
           [ 1412,  -973]])

As shown above, its coordinates can be treated as a NumPy array.

I want all the integer points on the line between these two endpoints. The output should look something like this:

array([[ 1520, -1140],
       [ 1519, -1139],
       [ 1519, -1138],
       ..., 
       [ 1413,  -975],
       [ 1412,  -974],
       [ 1412,  -973]], dtype=int32)

I posted this earlier on gis.stackexchange, hoping for an efficient solution in shapely. The solution I got worked well at first, but it is now too slow because I call it over 50000 times in my code. On my computer each loop takes about 0.03 s, resulting in over a day of running. That is too slow for what I need, so I am hoping someone knows of a vectorized solution.


Solution

  • Bresenham may be smart, but I'm pretty sure brute-force vectorization is faster. I've written two variants: the first is easier to read, the second is faster (about 80 us vs 50 us per call).

    Update: Fixed a bug (thanks @Varlor) and added an n-dimensional variant (connect_nd).

    import numpy as np
    from timeit import timeit
    
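    def connect(ends):
        # Readable variant: take one step per pixel along the axis with the
        # larger extent and round the interpolated other coordinate.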
        d0, d1 = np.abs(np.diff(ends, axis=0))[0]
        if d0 > d1: 
            return np.c_[np.linspace(ends[0, 0], ends[1, 0], d0+1, dtype=np.int32),
                         np.round(np.linspace(ends[0, 1], ends[1, 1], d0+1))
                         .astype(np.int32)]
        else:
            return np.c_[np.round(np.linspace(ends[0, 0], ends[1, 0], d1+1))
                         .astype(np.int32),
                         np.linspace(ends[0, 1], ends[1, 1], d1+1, dtype=np.int32)]
    
    
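    def connect2(ends):
        # Faster variant: same idea, but with integer arithmetic only. The minor
        # coordinate is scaled by |d|, offset by |d|//2, then floor-divided.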
        d0, d1 = np.diff(ends, axis=0)[0]
        if np.abs(d0) > np.abs(d1): 
            return np.c_[np.arange(ends[0, 0], ends[1,0] + np.sign(d0), np.sign(d0), dtype=np.int32),
                         np.arange(ends[0, 1] * np.abs(d0) + np.abs(d0)//2,
                                   ends[0, 1] * np.abs(d0) + np.abs(d0)//2 + (np.abs(d0)+1) * d1, d1, dtype=np.int32) // np.abs(d0)]
        else:
            return np.c_[np.arange(ends[0, 0] * np.abs(d1) + np.abs(d1)//2,
                                   ends[0, 0] * np.abs(d1) + np.abs(d1)//2 + (np.abs(d1)+1) * d0, d0, dtype=np.int32) // np.abs(d1),
                         np.arange(ends[0, 1], ends[1,1] + np.sign(d1), np.sign(d1), dtype=np.int32)]
    
    
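    def connect_nd(ends):
        # Same approach for any number of dimensions.
        d = np.diff(ends, axis=0)[0]   # per-axis delta
        j = np.argmax(np.abs(d))       # axis with the largest extent
        D = d[j]
        aD = np.abs(D)                 # number of steps along that axis
        # Step k is offset by k * d / aD, rounded to nearest via floor division.
        return ends[0] + (np.outer(np.arange(aD + 1), d) + (aD//2)) // aD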
    
    
    ends = np.array([[ 1520, -1140],
                     [ 1412,  -73]])
    
    ends_4d = np.array([[  100, -302, 101, -49],
                        [ -100,  -45, 112, 100]])
    
    print(connect(ends))
    print(connect_nd(ends_4d))
    
    
    assert np.all(connect(ends)==connect2(ends))
    assert np.all(connect(ends)==connect_nd(ends))
    assert np.all(connect(ends)==connect(ends[:, ::-1])[:, ::-1])
    assert np.all(connect(ends)==connect(ends[::-1])[::-1])
    
    # timeit returns total seconds for 10000 calls; * 100 converts to us per call
    print(timeit('f(ends)', globals={'f': connect, 'ends': ends}, number=10000)*100, 'us')
    print(timeit('f(ends)', globals={'f': connect2, 'ends': ends}, number=10000)*100, 'us')
    print(timeit('f(ends)', globals={'f': connect_nd, 'ends': ends}, number=10000)*100, 'us')
    

    Sample output:

    [[ 1520 -1140]
     [ 1520 -1139]
     [ 1520 -1138]
     ..., 
     [ 1412   -75]
     [ 1412   -74]
     [ 1412   -73]]
    [[ 100 -302  101  -49]
     [  99 -301  101  -48]
     [  98 -300  101  -48]
     ..., 
     [ -98  -47  112   99]
     [ -99  -46  112   99]
     [-100  -45  112  100]]
    78.8237597000034 us
    48.02509490000375 us
    62.78072760001123 us
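
Since the question calls this more than 50000 times, another option may be to rasterize many segments in one call instead of looping in Python. The following is only a sketch under stated assumptions and was not part of the benchmark above: `connect_many` and its `(points, offsets)` return convention are hypothetical names chosen for illustration, and the endpoints are assumed to be integers. It reuses the integer rounding from `connect_nd`, where step `k` is offset by `(k*d + aD//2) // aD`, and it guards against zero-length segments.

```python
import numpy as np


def connect_many(segments):
    """Rasterize many integer segments at once (hypothetical helper).

    segments: integer array of shape (n, 2, k) holding the start and end
    point of each segment.
    Returns (points, offsets): points has shape (total, k), and segment i
    occupies points[offsets[i]:offsets[i + 1]].
    """
    segments = np.asarray(segments, dtype=np.int64)
    start = segments[:, 0]
    d = segments[:, 1] - start                  # (n, k) per-axis deltas
    aD = np.abs(d).max(axis=1)                  # steps along the dominant axis
    counts = aD + 1                             # number of points per segment
    offsets = np.concatenate(([0], np.cumsum(counts)))
    # For every output row: which segment it belongs to, and its step index.
    seg_id = np.repeat(np.arange(len(segments)), counts)
    step = np.arange(offsets[-1]) - offsets[seg_id]
    # Use a divisor of at least 1 so zero-length segments yield their start point.
    div = np.maximum(aD, 1)[seg_id, None]
    points = start[seg_id] + (step[:, None] * d[seg_id] + div // 2) // div
    return points, offsets


segments = np.array([[[0, 0], [3, 1]],
                     [[2, 2], [2, 2]],
                     [[1520, -1140], [1412, -73]]])
points, offsets = connect_many(segments)
print(points[offsets[0]:offsets[1]])   # points of the first segment
```

Segment `i` is `points[offsets[i]:offsets[i + 1]]`; if a list of per-segment arrays is more convenient, `np.split(points, offsets[1:-1])` should recover it. Whether this beats a loop over `connect2` depends on how long the segments are, so it is worth timing on real data.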