python numpy cosine-similarity

How to calculate a weighted average on a triangular similarity matrix


I have a triangular similarity matrix like this:

[[3, 1, 2, 0],
 [1, 3, 0, 0],
 [1, 0, 0, 0],
 [0, 0, 0, 0]]

How do I calculate a weighted average for each row while discarding the zero elements?


Solution

  • You could sum along the second axis and divide by the number of non-zero values in each row. The where argument of np.divide restricts the division to positions where a condition holds; setting it to a mask of the rows that have at least one non-zero value prevents a division-by-zero error:

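    import numpy as np

    a = np.array([[3, 1, 2, 0],
                  [1, 3, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 0]])

    # number of non-zero entries in each row
    m = (a != 0).sum(1)
    # out= supplies the result (0) for rows where the mask is False;
    # without it, those entries are left uninitialized
    np.divide(a.sum(1), m, out=np.zeros(len(m)), where=m != 0)
    # array([2., 2., 1., 0.])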
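
The snippet above gives every non-zero entry the same weight, so it is a plain mean of the non-zero values. If the entries should be weighted differently, the same masking idea carries over. The following is only a sketch: the per-column weight vector `w` is a hypothetical example that does not appear in the question, and the right weights depend on what the similarity scores represent.

```python
import numpy as np

a = np.array([[3, 1, 2, 0],
              [1, 3, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 0]])

# Hypothetical per-column weights, chosen only for illustration.
w = np.array([4.0, 3.0, 2.0, 1.0])

mask = a != 0                   # True where an entry should count
num = (a * w).sum(axis=1)       # weighted sum per row (zeros contribute nothing)
den = (mask * w).sum(axis=1)    # total weight of the non-zero entries per row

# Rows with no non-zero entries keep the 0 supplied through out=.
weighted = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
```

With `w` set to all ones, this reduces to the unweighted result from the answer above.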