Search code examples
numpyvectorizationreductionargmaxnumpy-ufunc

Numpy index of the maximum with reduction - numpy.argmax.reduceat


I have a flat array b:

a = numpy.array([0, 1, 1, 2, 3, 1, 2])

And an array c of indices marking the start of each "chunk":

b = numpy.array([0, 4])

I know I can find the maximum in each "chunk" using a reduction:

m = numpy.maximum.reduceat(a,b)
>>> array([2, 3], dtype=int32)

But... Is there a way to find the index of the maximum <edit>within a chunk</edit> (like numpy.argmax), with vectorized operations (no lists, loops)?


Solution

  • Borrowing the idea from this post.

    Steps involved :

    • Offset all elements in a group by a limit-offset. Sort them globally, thus limiting each group to stay at their positions, but sorting the elements within each group.

    • In the sorted array, we would look for the last element, which would be the group max. Their indices would be the argmax after offsetting down for the group lengths.

    Thus, a vectorized implementation would be -

    def numpy_argmax_reduceat(a, b):
        n = a.max()+1  # limit-offset
        grp_count = np.append(b[1:] - b[:-1], a.size - b[-1])
        shift = n*np.repeat(np.arange(grp_count.size), grp_count)
        sortidx = (a+shift).argsort()
        grp_shifted_argmax = np.append(b[1:],a.size)-1
        return sortidx[grp_shifted_argmax] - b
    

    As a minor tweak and possibly faster one, we could alternatively create shift with cumsum and thus have a variation of the earlier approach, like so -

    def numpy_argmax_reduceat_v2(a, b):
        n = a.max()+1  # limit-offset
        id_arr = np.zeros(a.size,dtype=int)
        id_arr[b[1:]] = 1
        shift = n*id_arr.cumsum()
        sortidx = (a+shift).argsort()
        grp_shifted_argmax = np.append(b[1:],a.size)-1
        return sortidx[grp_shifted_argmax] - b