Manipulate each element before counting the higher elements in its row

I previously asked how to count, for each element, the values that are higher in its row/column, and got a really good answer. However, that approach does not let me manipulate the current element first.

My input array looks something like this:

array([[-1,  7, -2,  1,  4],
       [ 6,  3,  -3,  5,  1]])

Basically, I would like an output matrix that shows, for each element, how many values are higher in the given row or column. Row-wise, that looks like this:

array([[3, 0, 4, 2, 1],
       [0, 2, 4, 1, 3]], dtype=int64)

SciPy's rankdata function works really well here. (Thanks to @Tom)

The tricky part is this: since the matrix is a correlation matrix with scores between -1 and 1,
I would like to add one intermediate step (a normalization factor) before counting the higher values:

If the element is negative, add 3 to that element and then count how many values in the row are higher.
If the element is positive, subtract 3 from that element and then count how many values in the row are higher.

e.g.:

The first element of the row is negative, so we add 3, and the row becomes
2 7 -2 1 4 -> the number of values higher than that element is 2
The second element of the row is positive, so we subtract 3, and the row becomes
-1 4 -2 1 4 -> the number of values higher than that element is 0

...

So we do this normalization for each element of each row, and the desired row-wise output would be:

2 0 2 3 1 
1 3 4 2 3

I don't want to use a loop for this because the matrix is 11k x 12k and it takes too much time. If I use rankdata with a lambda, then instead of adjusting one element at a time, it adds and subtracts on all the row values at the same time, which is not what I want.

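import numpy as np
from scipy.stats import rankdata

corr = np.array([[-1,  7, -2,  1,  4],
                 [ 6,  3, -3,  5,  1]])


def element_wise_bigger_than(x, axis):
    # Number of values along `axis` that are strictly higher than each element.
    return x.shape[axis] - rankdata(x, method='max', axis=axis)


# Shifts every element at once, so each row is compared against its
# fully shifted version rather than against the original values.
ld = lambda t: t + 3 if t < 0 else t - 3
f = np.vectorize(ld)


element_wise_bigger_than(f(corr), 1)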

Solution

A possible solution, based on numba, using prange to parallelize the outer for loop:

from numba import njit, prange, set_num_threads
import numpy as np

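@njit(parallel=True)
def get_horizontal(a):
    z = np.zeros((a.shape[0], a.shape[1]), dtype=np.int32)

    # Rows are independent, so the outer loop can run in parallel.
    for i in prange(a.shape[0]):
        for j in range(a.shape[1]):
            aux = a[i, j]  # remember the original value

            # Shift the current element only: +3 if negative, -3 if positive.
            if a[i, j] < 0:
                a[i, j] += 3
            elif a[i, j] > 0:
                a[i, j] -= 3

            # Count the row values strictly higher than the shifted element.
            z[i, j] = (a[i, j] < a[i, :]).sum()
            a[i, j] = aux  # restore the original value
    return z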

a = np.array([[-1,  7, -2,  1,  4],
       [ 6,  3,  -3,  5,  1]])

set_num_threads(6)  # use only 6 threads; lower this if fewer are available

get_horizontal(a)

Runtime:

By using the following array,

a = np.random.randint(-10, 10, size=(11000, 12000))

the runtime, on my computer, is less than 1 minute.

Output (for the small 2x5 example array):

array([[2, 0, 2, 3, 1],
       [1, 3, 4, 2, 3]], dtype=int32)