I have a 1D array of sorted, non-unique numbers; how many times each value repeats is arbitrary. It is paired with an array of weights of the same size. Within a given run of identical elements, the associated weights may or may not repeat, and the weights array as a whole may or may not contain repeated values. E.g.:
pos = np.array([3, 3, 7, 7, 9, 9, 9, 10, 10])
weights = np.array([2, 10, 20, 8, 5, 7, 15, 7, 2])
For each unique element of pos, I need to extract the index of the occurrence with the greatest weight.
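To make the requirement concrete, the expected output for the sample above (it matches the result of the loop solution below) is:

unique_pos_idx = [1, 2, 6, 7]   # picks weights 10, 20, 15, 7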
The working solution I came up with involves looping:
import numpy as np

pos = np.array([3, 3, 7, 7, 9, 9, 9, 10, 10])
weights = np.array([2, 10, 20, 8, 5, 7, 15, 7, 2])
# Get the number of occurrences of the elements in pos, but throw away
# the unique array; it's not the one I want.
_, ucounts = np.unique(pos, return_counts=True)
# Initialize the output array.
unique_pos_idx = np.zeros([ucounts.size], dtype=np.uint32)
last = 0
for i in range(ucounts.size):
    maxpos = np.argmax(weights[last:last+ucounts[i]])
    unique_pos_idx[i] = last + maxpos
    last += ucounts[i]
# Result is:
# unique_pos_idx = [1 2 6 7]
but I’m not using much of the Python language or NumPy (apart from the arrays themselves), so I wonder if there is a more Pythonic and/or more efficient solution, perhaps even faster than a Cython version of the above?
Thanks
Here's one vectorized approach -
sidx = np.lexsort([weights, pos])  # sort by pos, breaking ties by weights
out = sidx[np.r_[np.flatnonzero(pos[1:] != pos[:-1]), -1]]  # last slot of each group
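To unpack why this works: np.lexsort sorts primarily by its last key (pos) and secondarily by weights, so within each run of equal pos values the max-weight occurrence lands last in the sorted order; pos[1:] != pos[:-1] marks the end of every run except the final one, which the trailing -1 covers. A quick check on the question's sample:

import numpy as np

pos = np.array([3, 3, 7, 7, 9, 9, 9, 10, 10])
weights = np.array([2, 10, 20, 8, 5, 7, 15, 7, 2])

sidx = np.lexsort([weights, pos])   # [0 1 3 2 4 5 6 8 7]
out = sidx[np.r_[np.flatnonzero(pos[1:] != pos[:-1]), -1]]
print(out)            # [1 2 6 7]
print(weights[out])   # [10 20 15  7] -> max weight of each group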
Possible improvement(s) on performance -
1] A faster way to get the sorted indices sidx with scaling -
sidx = (pos*(weights.max()+1) + weights).argsort()
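The scaling folds the two sort keys into one: since every weight is strictly less than weights.max()+1, the combined integer pos*(weights.max()+1) + weights orders exactly like the (pos, weight) pair, and a single argsort is cheaper than lexsort. One thing to watch (my caveat, not part of the original answer): for large values the product can overflow the arrays' integer dtype, so casting up may be needed -

scale = weights.max() + 1
# Cast to a wider dtype first if pos.max() * scale might overflow:
sidx = (pos.astype(np.int64) * scale + weights).argsort()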
2] The indexing at the end could be made faster with boolean indexing, especially when dealing with many such intervals/groupings -
out = sidx[np.concatenate((pos[1:] != pos[:-1], [True]))]
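On the sample, that boolean mask simply flags the last slot of each group, avoiding the integer fancy-indexing done with np.r_; a quick sketch, reusing sidx from above:

mask = np.concatenate((pos[1:] != pos[:-1], [True]))
# mask -> [False True False True False False True False True]
print(sidx[mask])   # [1 2 6 7]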
Runtime test
All approaches:
def org_app(pos, weights):
    _, ucounts = np.unique(pos, return_counts=True)
    unique_pos_idx = np.zeros([ucounts.size], dtype=np.uint32)
    last = 0
    for i in range(ucounts.size):
        maxpos = np.argmax(weights[last:last+ucounts[i]])
        unique_pos_idx[i] = last + maxpos
        last += ucounts[i]
    return unique_pos_idx

def vec_app(pos, weights):
    sidx = np.lexsort([weights, pos])
    return sidx[np.r_[np.flatnonzero(pos[1:] != pos[:-1]), -1]]

def vec_app_v2(pos, weights):
    sidx = (pos*(weights.max()+1) + weights).argsort()
    return sidx[np.concatenate((pos[1:] != pos[:-1], [True]))]
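Before the full-scale timings, a quick sanity check of all three functions on the question's sample (my addition; the large-scale verification follows below):

pos = np.array([3, 3, 7, 7, 9, 9, 9, 10, 10])
weights = np.array([2, 10, 20, 8, 5, 7, 15, 7, 2])
print(org_app(pos, weights))     # [1 2 6 7]
print(vec_app(pos, weights))     # [1 2 6 7]
print(vec_app_v2(pos, weights))  # [1 2 6 7]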
Timings and verification -
For the setup, let's use the sample and tile it 10000 times with scaling, as we intend to create 10000 times as many intervals. Also, let's use unique numbers in weights, so that the argmax indices aren't confused by identical numbers:
In [155]: # Setup input
...: pos = np.array([3, 3, 7, 7, 9, 9, 9, 10, 10,])
...: pos = (pos + 10*np.arange(10000)[:,None]).ravel()
...: weights = np.random.choice(10*len(pos), size=len(pos), replace=0)
...:
...: print np.allclose(org_app(pos, weights), vec_app(pos, weights))
...: print np.allclose(org_app(pos, weights), vec_app_v2(pos, weights))
...:
True
True
In [156]: %timeit org_app(pos, weights)
...: %timeit vec_app(pos, weights)
...: %timeit vec_app_v2(pos, weights)
...:
10 loops, best of 3: 56.4 ms per loop
100 loops, best of 3: 14.8 ms per loop
1000 loops, best of 3: 1.77 ms per loop
In [157]: 56.4/1.77 # Speedup with vectorized one over loopy
Out[157]: 31.864406779661017