python numpy matrix recommendation-engine

Make numpy matrix more sparse


Suppose I have a numpy array:

np.array([
    [3, 0, 5, 3, 0, 1],
    [0, 1, 2, 1, 5, 2],
    [4, 3, 5, 3, 1, 4],
    [2, 5, 2, 5, 3, 1],
    [0, 1, 2, 1, 5, 2],
])

Now I want to randomly replace some of the elements with 0, so that the output looks like this:

np.array([
    [3, 0, 0, 3, 0, 1],
    [0, 1, 2, 0, 5, 2],
    [0, 3, 0, 3, 1, 0],
    [2, 0, 2, 5, 0, 1],
    [0, 0, 2, 0, 5, 0],
])

Solution

  • We can use np.random.choice(..., replace=False) to randomly select a number of unique flattened indices of non-zero elements, and then use those indices to reset the corresponding elements of the input array to zero.

    Thus, one solution would be -

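    import numpy as np

    def make_more_sparsey(a, n):
        # a is the input array; it is modified in place and also returned
        # n is the number of non-zero elements to be reset to zero
        # (n must not exceed the number of non-zero elements in a)
        idx = np.flatnonzero(a) # for performance, use np.flatnonzero(a!=0)
        np.put(a, np.random.choice(idx, n, replace=False), 0)
        return a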
    

    Sample run -

    In [204]: R = np.array([
         ...:     [3, 0, 5, 3, 0, 1],
         ...:     [0, 1, 2, 1, 5, 2],
         ...:     [4, 3, 5, 3, 1, 4],
         ...:     [2, 5, 2, 5, 3, 1],
         ...:     [0, 1, 2, 1, 5, 2],
         ...: ])
    
    In [205]: make_more_sparsey(R, n=5)
    Out[205]: 
    array([[3, 0, 5, 3, 0, 1],
           [0, 1, 0, 0, 5, 2],
           [4, 3, 5, 3, 1, 4],
           [2, 5, 0, 5, 3, 1],
           [0, 1, 0, 1, 0, 2]])
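
If the original array has to stay untouched, or the result has to be reproducible, the same idea can be sketched as a non-mutating variant. This is only a sketch under assumptions: the name sparsify_copy and its seed parameter are hypothetical (not from the question), and it uses np.random.default_rng instead of the legacy np.random.choice.

```python
import numpy as np

def sparsify_copy(a, n, seed=None):
    """Return a copy of `a` with `n` randomly chosen non-zero entries set to 0.

    `sparsify_copy` and `seed` are illustrative names, not part of the
    original answer. The input array is left unchanged.
    """
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(a)  # flat indices of the non-zero entries
    if n > idx.size:
        raise ValueError("n exceeds the number of non-zero elements")
    out = a.copy()
    # pick n distinct flat indices and reset them in the copy
    np.put(out, rng.choice(idx, size=n, replace=False), 0)
    return out

R = np.array([
    [3, 0, 5, 3, 0, 1],
    [0, 1, 2, 1, 5, 2],
    [4, 3, 5, 3, 1, 4],
    [2, 5, 2, 5, 3, 1],
    [0, 1, 2, 1, 5, 2],
])
R_before = R.copy()
S = sparsify_copy(R, n=5, seed=0)
```

Here S has exactly five fewer non-zero entries than R, and R itself is not modified. Which five entries are reset depends on the seed.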