This seems simple, but I would like to do this as efficiently as possible.
Effectively, I am interested in the points that are highest, lowest, most right, and most left.
Given an Array like [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]
I have done this
max_x = max([p[0] for p in pts])
min_x = min([p[0] for p in pts])
max_y = max([p[1] for p in pts])
min_y = min([p[1] for p in pts])
But I don't just need the max_x
value though. I need the entire point, and I would rather not iterate though the list more than needed (for the sake of speed with large input).
Bonus points if it's general for N-dimensional points (highest and lowest in each dimension).
Numpy was designed for this kind of problem. It is a performant implementation of multi-dimensional numerical arrays (nested lists of regular shape):
import numpy as np
pts = [[10,2],[0,2],[1,10],[1,0],[2,3],[5,2],[7,2],[7,3],[3,8],[6,1]]
arr = np.array(pts)
max_idx = np.argmax(arr, axis=0)
min_idx = np.argmin(arr, axis=0)
max_x, max_y = arr[max_idx]
min_x, min_y = arr[min_idx]
output:
# max_x, max_y, min_x, min_y
array([10, 2])
array([ 1, 10])
array([0, 2])
array([1, 0])
Performance comparison of lists vs arrays for large N
from random import random
N = int(1e7) # 10m points
def list_version(N):
pts = [[random(), random()] for j in range(N)]
max_x = max(pts, key = lambda x: x[0])
max_y = max(pts, key = lambda x: x[1])
min_x = min(pts, key = lambda x: x[0])
min_y = min(pts, key = lambda x: x[1])
return max_x, min_x, max_y, min_y
def arr_version(N):
arr = np.random.random(size=(N,2))
max_idx = np.argmax(arr, axis=0)
min_idx = np.argmin(arr, axis=0)
max_x, max_y = arr[max_idx]
min_x, min_y = arr[min_idx]
return max_x, min_x, max_y, min_y
%timeit list_version(N)
4.62 s ± 25.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit arr_version(N)
269 ms ± 2.28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)