Most efficient way to find mode in numpy array


I have a 2D array containing integers (both positive and negative). Each row represents the values over time for a particular spatial site, whereas each column represents values for various spatial sites at a given time.

So if the array is:

1 3 4 2 2 7
5 2 2 1 4 1
3 3 2 2 1 1

The result should be the mode of each column:

1 3 2 2 2 1

Note that when a column has several equally frequent values, any one of them (selected arbitrarily) may be reported as the mode.

I can iterate over the columns and find the mode one at a time, but I was hoping numpy might have a built-in function for this, or that there is a trick to do it efficiently without looping.


Solution

  • Check scipy.stats.mode() (inspired by @tom10's comment):

    import numpy as np
    from scipy import stats
    
    a = np.array([[1, 3, 4, 2, 2, 7],
                  [5, 2, 2, 1, 4, 1],
                  [3, 3, 2, 2, 1, 1]])
    
    m = stats.mode(a)
    print(m)
    

    Output:

    ModeResult(mode=array([[1, 3, 2, 2, 1, 1]]), count=array([[1, 2, 2, 2, 1, 2]]))
    

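    As you can see, it returns both the modes and their counts, computed along axis 0 (one mode per column) by default. Where a column has ties (the first and fifth columns here), only one of the tied values is returned, which the question allows. The output above comes from an older SciPy release; in SciPy 1.11 and later the returned arrays are 1-D by default unless keepdims=True is passed. You can select the modes directly via m[0] (or, equivalently, m.mode):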

    print(m[0])
    

    Output:

    [[1 3 2 2 1 1]]
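
If SciPy is not available, a NumPy-only alternative can be sketched with np.bincount. This is not part of the original answer: the helper name column_modes is hypothetical, and the approach assumes a 2D integer array whose values span a modest range, because it allocates one counter per (column, value) pair. On ties it picks the smallest value.

```python
import numpy as np

def column_modes(a):
    """Most frequent value in each column of a 2D integer array (hypothetical helper)."""
    a = np.asarray(a)
    n_cols = a.shape[1]

    # Shift values so the smallest becomes 0; np.bincount needs non-negative ints.
    lo = a.min()
    shifted = a - lo
    n_vals = int(shifted.max()) + 1

    # Give every (column, value) pair its own bin: bin = column * n_vals + value.
    bins = shifted + n_vals * np.arange(n_cols)
    counts = np.bincount(bins.ravel(), minlength=n_vals * n_cols)
    counts = counts.reshape(n_cols, n_vals)  # counts[j, v] = occurrences of v + lo in column j

    # argmax returns the first maximum, i.e. the smallest value among ties.
    return counts.argmax(axis=1) + lo

a = np.array([[1, 3, 4, 2, 2, 7],
              [5, 2, 2, 1, 4, 1],
              [3, 3, 2, 2, 1, 1]])

print(column_modes(a))
```

For the array from the question this prints [1 3 2 2 1 1]. If the values span a very wide range, the counter table becomes large, and a per-column loop with np.unique(..., return_counts=True) may be the more practical choice.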