Search code examples
pythonarraysmaxmin

Get point with highest and lowest x and y value in python


This seems simple, but I would like to do this as efficiently as possible.

Effectively, I am interested in the points that are highest, lowest, most right, and most left.

Given an Array like [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]

I have done this

max_x = max([p[0] for p in pts])
min_x = min([p[0] for p in pts])
max_y = max([p[1] for p in pts])
min_y = min([p[1] for p in pts])

But I don't just need the max_x value though. I need the entire point, and I would rather not iterate though the list more than needed (for the sake of speed with large input).

Bonus points if it's general for N-dimensional points (highest and lowest in each dimension).


Solution

  • Numpy was designed for this kind of problem. It is a performant implementation of multi-dimensional numerical arrays (nested lists of regular shape):

    import numpy as np
    
    pts = [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]
    arr = np.array(pts)
    max_idx = np.argmax(arr, axis=0)
    min_idx = np.argmin(arr, axis=0)
    max_x, max_y = arr[max_idx]
    min_x, min_y = arr[min_idx]
    

    output:

    # max_x, max_y, min_x, min_y
    array([10,  2])
    array([ 1, 10])
    array([0, 2])
    array([1, 0])
    

    Performance comparison of lists vs arrays for large N

    from random import random
    N = int(1e7) # 10m points
    
    def list_version(N):
        pts = [[random(), random()] for j in range(N)]
        max_x = max(pts, key = lambda x: x[0])
        max_y = max(pts, key = lambda x: x[1])
        min_x = min(pts, key = lambda x: x[0])
        min_y = min(pts, key = lambda x: x[1])
        return max_x, min_x, max_y, min_y
    
    def arr_version(N):
        arr = np.random.random(size=(N,2))
        max_idx = np.argmax(arr, axis=0)
        min_idx = np.argmin(arr, axis=0)
        max_x, max_y = arr[max_idx]
        min_x, min_y = arr[min_idx]
        return max_x, min_x, max_y, min_y
    
    %timeit list_version(N)
    4.62 s ± 25.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    %timeit arr_version(N)
    269 ms ± 2.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)