Search code examples
pythonperformancemathcoordinateseuclidean-distance

Can co-ordinate distance (euclidean distance) comparison be simplified?


Question

I have a target co-ordinate pair (lat, long), and I have a list of other co-ordinate pairs. The units are in decimal degrees.

I'd like to find the pair closest to the target, not distance itself, so purely comparison.

More precisely, I'd like to know how simple of a calculation can I use / or what the fastest method is?

I need accuracy to the closest 10m and I'd like to not use modules for this.

Initial Solution

Initially, I figured it'd be easy as e.g.

coord_set = [{lat, long}, {..}]

target_lat = 20.1234
target_long = -120.5678

pair1 = abs(target_lat - coord_set[0][lat]) + abs(target_long - coord_set[0][long])
pair2 = abs(..) + abs(..)

if pair1 < pair2:
   closest = coord_set[0]
...

Tangent

The I thought perhaps I could simply compare totals i.e.

total = target_lat + target_long
(total - (pair1[lat] + pair1[long])) > (total - (pair2[lat] + pair2[long]))

But I found that has some issues.

Euclidean Distance

Finally I found that I should use euclidean distance (I don't need Great-circle distance, as these coordiantes are typically less than 2-3 km apart).

I found this SO Answer noting comparisons don't need to be square rooted, since distance isn't needed.

**But then I ask, why even square them? ** Simplifying -

pair1 = ((lat - pair1[lat])**2 + (long - pair1[long])**2)**(1/2)
pair2 = ((lat - pair2[lat])**2 + (long - pair2[long])**2)**(1/2)

To this

pair1 = (lat - pair1[lat])**2 + (long - pair1[long])**2
pair2 = (lat - pair2[lat])**2 + (long - pair2[long])**2

To finally this, which is essentially what I started with

pair1 = (lat - pair1[lat]) + (long - pair1[long])
pair2 = (lat - pair2[lat]) + (long - pair2[long])

Is it because it's a 2D plane? Or should I use it for better accuracy?

I'm struggling to visualise it on a graph, so I can't figure out why for my problem, I should use Euclidean distance


Solution

  • There are some mathematical inconsistencies in your reasoning.

    I'd like to find the pair closest to the target, not distance itself, so purely comparison.

    Finding the pair of coordinates closest to the target is mathematically the pair that is at the shortest distance from the target. So, it is distance itself.

    Square root is a monotone increasing function, so in order to compare distances you don't really need to pass by the square root, indeed. Thus, your first simplification is mathematically correct, but the second one isn't. As Neil mentioned in the first answer, latitude and longitude are signed values! So, you can't "simplify" by removing the squares.

    Here's an example:

    target = (0, 0)
    pair_1 = (-1, 1)
    pair_2 = (1, 1)
    

    Then, the squared distances would be:

    squared_distance_pair_1 = (0-(-1))**2 + (0-1)**2 = 1 + 1 = 2
    squared_distance_pair_2 = (0-1)**2 + (0-1)**2 = 1 + 1 = 2
    

    So you see that both points are equidistant from the target (the origin in this case). If you draw these points, you would easily see it's true.

    But if you inject these values in your second simplification expression:

    pair1 = (lat - pair1[lat]) + (long - pair1[long])
    pair2 = (lat - pair2[lat]) + (long - pair2[long])
    

    you'll get:

    pair1 = (0 - (-1)) + (0 - 1) = 0
    pair2 = (0 - 1) + (0 - 1) = -2
    

    which doesn't give you any information about the distance.

    In conclusion, you have to compute all distances and find the shortest one. And bear in mind that you may have more than one point at that same shortest distance from your target point:

    coords = [(lat, long), ...]
    target_lat = 2.0
    target_long = -3.0
    
    distances = [(c[0] - target_lat)**2 + (c[1] - target_long)**2 for c in coords]
    shortest_distance = min(distances)
    closest_pairs = [coords[i] for i, d in enumerate(distances) if d == shortest_distance]
    

    I wrote it this way because you said you didn't want to use any external libraries, but for better performances you could at least use numpy.

    Best regards.