Search code examples
numpycomputational-geometrysparse-matrixkdtreesparse-array

Local maxima in a point cloud


I have a point cloud C, where each point has an associated value. Lets say the points are in 2-d space, so each point can be represented with the triplet (x, y, v).

I'd like to find the subset of points which are local maxima. That is, for some radius R, I would like to find the subset of points S in C such that for any point Pi (with value vi) in S, there is no point Pj in C within R distance of Pi whose value vj is greater that vi.

I see how I could do this in O(N^2) time, but that seems wasteful. Is there an efficient way to do this?


Side Notes:

  • The source of this problem is that I'm trying to find local maxima in a sparse matrix, so in my case x, y are ordered integer indeces - if this simplifies the problem let me know!
  • I'm perfectly happy if the solution is just for a manhattan distance or whatever.
  • I'm in python, so if there's some kind of nice vectorized numpy way to do this that's just great.

Solution

  • Following up on Yves' suggestion, here's an answer, which uses scipy's KDTree:

    from scipy.spatial.kdtree import KDTree
    import numpy as np
    
    def locally_extreme_points(coords, data, neighbourhood, lookfor = 'max', p_norm = 2.):
        '''
        Find local maxima of points in a pointcloud.  Ties result in both points passing through the filter.
    
        Not to be used for high-dimensional data.  It will be slow.
    
        coords: A shape (n_points, n_dims) array of point locations
        data: A shape (n_points, ) vector of point values
        neighbourhood: The (scalar) size of the neighbourhood in which to search.
        lookfor: Either 'max', or 'min', depending on whether you want local maxima or minima
        p_norm: The p-norm to use for measuring distance (e.g. 1=Manhattan, 2=Euclidian)
    
        returns
            filtered_coords: The coordinates of locally extreme points
            filtered_data: The values of these points
        '''
        assert coords.shape[0] == data.shape[0], 'You must have one coordinate per data point'
        extreme_fcn = {'min': np.min, 'max': np.max}[lookfor]
        kdtree = KDTree(coords)
        neighbours = kdtree.query_ball_tree(kdtree, r=neighbourhood, p = p_norm)
        i_am_extreme = [data[i]==extreme_fcn(data[n]) for i, n in enumerate(neighbours)]
        extrema, = np.nonzero(i_am_extreme)  # This line just saves time on indexing
        return coords[extrema], data[extrema]