python pandas scipy delaunay

Euclidean distance of Delaunay triangulation - Scipy


SciPy's spatial module can measure the Euclidean distance between specified points. Is it possible to get a comparable measurement from scipy.spatial.Delaunay? With the df below, I currently measure the average distance between all pairs of points, grouped by Time. Instead, I'd like to average only the distances along the edges of the Delaunay triangulation.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy import spatial
from scipy.spatial import Delaunay

df = pd.DataFrame({
    'Time': [1, 1, 1, 1, 2, 2, 2, 2],
    'A_X': [5, 5, 6, 6, 4, 3, 3, 4],
    'A_Y': [5, 6, 6, 5, 5, 6, 5, 6],
})

def make_points(x):
    return np.array(list(zip(x['A_X'], x['A_Y'])))

points = df.groupby("Time").apply(make_points)

fig, ax = plt.subplots()

# draw the Delaunay triangulation of each Time group
for p in points:
    tri = Delaunay(p)
    ax.triplot(*p.T, tri.simplices)

plt.show()

The average distance between all pairs of points can be measured as below, but I'm hoping to use the Delaunay edges instead.

# requires: from scipy import spatial
avg_dist = (
    df.groupby(['Time'])
      .apply(lambda x: spatial.distance.pdist(x[['A_X', 'A_Y']].to_numpy()).mean()
             if len(x) > 1 else 0)
      .reset_index()
)

Intended Output:

   Time         0
0     1  1.082842
1     2  1.082842

Solution

  • You can try this function

    from itertools import combinations
    import numpy as np
    from scipy.spatial import Delaunay

    def edges_with_no_replacement(points):
        # keep only the unique (A_X, A_Y) coordinates
        points = np.unique(points[['A_X', 'A_Y']].to_numpy(), axis=0)
        # zero or one point: there is no distance to measure
        if len(points) <= 1:
            return 0
        # two points cannot form a triangle, so return the distance between
        # them (change this to 0 if that suits your use case better)
        if len(points) == 2:
            return np.linalg.norm(points[0] - points[1])

        tri = Delaunay(points)
        triangles = tri.simplices
        # collect each edge once, even when two triangles share it
        all_edges = {tuple(sorted(edge)) for item in triangles for edge in combinations(item, 2)}
        # average length over the unique edges
        return np.mean([np.linalg.norm(points[i] - points[j]) for i, j in all_edges])
    

    This function first collects the unique edges of all the triangles, then returns the average edge length. Apply it to each group:

    avg_dist = (df.groupby(['Time']).apply(edges_with_no_replacement).reset_index())
    

    The output is

       Time         0
    0     1  1.082843
    1     2  1.082843
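To see where 1.082843 comes from, here is a small check on the four Time == 1 points, which form a unit square. It is only an illustrative sketch: the variable names are mine, and it assumes Delaunay splits the square into two triangles sharing one diagonal, which is what the output above implies.

```python
from itertools import combinations

import numpy as np
from scipy.spatial import Delaunay
from scipy.spatial.distance import pdist

# The four Time == 1 points from the question: a unit square.
square = np.array([[5, 5], [5, 6], [6, 6], [6, 5]], dtype=float)

tri = Delaunay(square)

# Each triangle contributes three edges; sorting the index pairs and using a
# set keeps the diagonal shared by both triangles from being counted twice.
edges = {
    tuple(sorted(pair))
    for simplex in tri.simplices
    for pair in combinations(simplex, 2)
}
lengths = [np.linalg.norm(square[i] - square[j]) for i, j in edges]

delaunay_mean = float(np.mean(lengths))      # four sides + one diagonal
pairwise_mean = float(pdist(square).mean())  # four sides + two diagonals

print(len(edges), round(delaunay_mean, 6), round(pairwise_mean, 6))
```

The Delaunay average is (4 + sqrt(2)) / 5, while the all-pairs average from pdist is (4 + 2 * sqrt(2)) / 6, so the two measurements differ even on this simple shape.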
    

    Note that edges_with_no_replacement will still raise QhullError if all the points lie on the same line, for example:

    Delaunay(np.array([[1,2],[1,3],[1,4]]))
    

    So you have to make sure the points in each group are not all collinear.
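One way to check for that case up front is a rank test on the point differences. This is a minimal sketch, not part of the function above: is_collinear is a hypothetical helper name, and it assumes 2-D points.

```python
import numpy as np

def is_collinear(points):
    """Return True if all points lie on one line (or there are fewer than 3)."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return True
    # Vectors from the first point to every other point span at most one
    # dimension when the points are collinear, so their matrix rank is < 2.
    return bool(np.linalg.matrix_rank(pts[1:] - pts[0]) < 2)

print(is_collinear([[1, 2], [1, 3], [1, 4]]))          # points on a vertical line
print(is_collinear([[5, 5], [5, 6], [6, 6], [6, 5]]))  # unit square
```

You could call such a check before Delaunay(points) and decide what a collinear group should return (for example 0 or NaN); that choice depends on your use case.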