arrays python-3.x numpy multidimensional-array argmax

Is there a way to find the UNIQUE row indices of maximum columnar values in a 2D NumPy array?

For each column in a 2D NumPy array, the column's maximum value can appear more than once. I would like to find the row index for each column maximum, without repeating row indices.

Here is an example that demonstrates why np.argmax doesn't work (it returns the first occurrence of each column's maximum, so row indices can repeat):

import numpy as np

a = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 0, 1]])

ind = np.argmax(a, axis=0)

print(ind)

Output:

[0 0 2]

I want the result: [1, 0, 2] for this example.

That is:

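The row index for the second column must be 0, because row 0 is the only row holding that column's maximum
This implies that the row index for the first column must be 1, because row 0 is already taken
This in turn implies that the row index for the third column must be 2, because row 1 is already taken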

A slightly more complex example is this array:

a = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 0, 1]])

In this case, there is no column with a unique maximum value. I'd be happy with either of these answers:

[0, 1, 2]
[1, 0, 2]

An even more complex example is:

a = np.array([[1, 1, 1],
              [1, 1, 1],
              [0, 1, 1]])

In this case, I'd be happy with any of these answers:

[0, 1, 2]
[0, 2, 1]
[1, 0, 2]
[1, 2, 0]

I can solve these problems with loops and logical conditions, but is there a way to solve them using NumPy functions?

Solution

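This may be overkill, but you can use scipy.optimize.linear_sum_assignment, which matches each column to a distinct row so that the sum of the selected entries is as large as possible: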

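import numpy as np
from scipy.optimize import linear_sum_assignment

a = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 0, 1]])

# Transpose so that each column of a is matched to a row of a;
# negate because the solver minimizes the total cost
linear_sum_assignment(-a.T)[1]
# array([1, 0, 2])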

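Note that you can always reduce to the 0/1 case by marking the entries that equal their column maximum, using something like

abin = a == a.max(axis=0)

Passing abin in place of a (cast to an integer type first, since NumPy does not support unary minus on boolean arrays) can speed up the assignment quite a bit.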
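Putting the two steps together, a minimal sketch might look like the following. The helper name unique_argmax and the final validity check are my own additions for illustration and are not part of SciPy. The check raises if no assignment gives every column a distinct row holding its maximum.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def unique_argmax(a):
    """Return one distinct row index per column, each at that column's maximum.

    Hypothetical helper for illustration; raises ValueError if no such
    assignment exists.
    """
    a = np.asarray(a)
    # 1 where an entry equals its column maximum, 0 elsewhere
    abin = (a == a.max(axis=0)).astype(int)
    # Rows of abin.T are the columns of a; negate to maximize the number of hits
    cols, rows = linear_sum_assignment(-abin.T)
    # Every column must be assigned, and every assigned entry must be a maximum
    if len(cols) < a.shape[1] or not abin[rows, cols].all():
        raise ValueError("no assignment with distinct rows hits every column maximum")
    return rows


a = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 0, 1]])
print(unique_argmax(a))  # either [0 1 2] or [1 0 2]
```

When several answers are acceptable, which one comes back depends on the solver, so the result should be treated as "one valid answer" rather than a specific one.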

Alternatively, see this post for a graph theory solution.
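The linked post is not reproduced here, so the following is only a sketch of one standard graph-theoretic approach, not necessarily the one that post uses. It treats columns and rows as the two sides of a bipartite graph, with an edge wherever a row holds a column's maximum, and finds a matching with augmenting paths. The helper name unique_argmax_matching is hypothetical. It uses a Python loop with recursion, so it is meant for small arrays.

```python
import numpy as np


def unique_argmax_matching(a):
    """Match every column to a distinct row holding its maximum.

    Hypothetical helper for illustration; returns None if no complete
    matching exists.
    """
    a = np.asarray(a)
    is_max = a == a.max(axis=0)      # is_max[r, c]: row r holds column c's maximum
    n_rows, n_cols = a.shape
    row_owner = [-1] * n_rows        # row_owner[r]: column currently using row r

    def try_assign(c, visited):
        # Try to give column c a row, re-routing earlier columns if needed
        for r in map(int, np.flatnonzero(is_max[:, c])):
            if r in visited:
                continue
            visited.add(r)
            if row_owner[r] == -1 or try_assign(row_owner[r], visited):
                row_owner[r] = c
                return True
        return False

    for c in range(n_cols):
        if not try_assign(c, set()):
            return None              # column c cannot get a row of its own

    # Invert row -> column ownership into one row index per column
    result = np.empty(n_cols, dtype=int)
    for r, c in enumerate(row_owner):
        if c != -1:
            result[c] = r
    return result


a = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 0, 1]])
print(unique_argmax_matching(a))  # [1 0 2]
```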