python, numpy, scipy

Check if two scipy.sparse.csr_matrix are equal


I want to check whether two csr_matrix objects are equal.

If I do:

x.__eq__(y)

I get:

raise ValueError("The truth value of an array with more than one "
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

This, however, works well:

assert (z in x for z in y)

Is there a better way to do it? Maybe using some optimized scipy function instead?

Thanks so much


Solution

  • Can we assume they are the same shape?

    In [201]: from scipy import sparse
    In [202]: a = sparse.csr_matrix([[0,1],[1,0]])
    In [203]: b = sparse.csr_matrix([[0,1],[1,1]])
    In [204]: (a != b).nnz == 0
    Out[204]: False
    

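    This checks the sparsity of the inequality array: a != b stores an entry only where the two matrices differ, so nnz == 0 means they are equal. Here they differ at [1, 1], hence False.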

    It will give you an efficiency warning (SparseEfficiencyWarning) if you try a==b, at least the first time you use it. That's because it has to test all those zeros: a==b is True wherever both matrices are zero, so the result can't take much advantage of the sparsity.

    You need a relatively recent version of scipy to use logical operators like this. Were you trying to use x.__eq__(y) in some if expression, or did you get the error from just that expression?

    In general you probably want to check several parameters first: same shape, same nnz, same dtype. You also need to be careful with floats, since values that differ only by rounding will compare as unequal.

    For dense arrays, np.allclose is a good way of testing equality. If the sparse arrays aren't too large, converting them to dense might work well too (a.toarray() is the explicit spelling of the older a.A shorthand):

    np.allclose(a.toarray(), b.toarray())
    

    allclose uses all(less_equal(abs(x-y), atol + rtol * abs(y))). You can use a-b, but I suspect that this too will give an efficiency warning.
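
The checks above could be put together roughly as follows. This is only a sketch, not a scipy API: the helper names sparse_equal and sparse_allclose are made up for illustration, and the tolerance version assumes finite values and atol >= 0. It rearranges the allclose test as abs(a-b) - rtol*abs(b) <= atol, so everything stays sparse and only the maximum needs to be inspected.

```python
import numpy as np
from scipy import sparse


def sparse_equal(a, b):
    """Exact equality: same shape and no differing entries (hypothetical helper)."""
    if a.shape != b.shape:
        return False
    # (a != b) is itself sparse; it stores an entry only where a and b differ.
    return (a != b).nnz == 0


def sparse_allclose(a, b, rtol=1e-5, atol=1e-8):
    """Sparse analogue of np.allclose (hypothetical helper).

    Assumes finite values and atol >= 0.
    """
    if a.shape != b.shape:
        return False
    if a.shape[0] == 0 or a.shape[1] == 0:
        return True  # nothing to compare
    # np.allclose tests |a - b| <= atol + rtol * |b| elementwise.
    # Rearranged as |a - b| - rtol * |b| <= atol, the left side is 0 wherever
    # both inputs are 0, so it stays sparse and only its maximum matters.
    excess = abs(a - b) - rtol * abs(b)
    return bool(excess.max() <= atol)
```

With a and b from the session above, sparse_equal(a, b) returns False. The sketch deliberately skips the nnz shortcut: nnz counts stored entries, including explicitly stored zeros, so two equal matrices can report different nnz values. A dtype check can be added in the same way as the shape check if that matters for your use case.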