I have an adjacency matrix A of a graph. After A = A.sign(), there are still some elements that are not 1, 0, or -1.
In [35]: A = A.sign()
In [36]: A.getcol(0).data
Out[36]:
array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 2.])
In [37]: A
Out[37]:
<519403x519403 sparse matrix of type '<type 'numpy.float64'>'
with 3819116 stored elements in COOrdinate format>
On the other hand, numpy.sign() works fine.
In [50]: a = A.getcol(0)
In [51]: np.sum(a.todense())
Out[51]: 58.0
In [52]: np.sum(np.sign(a.todense()))
Out[52]: 57.0
After some research I found the answer. It comes down to the internal data structure SciPy uses.
import numpy as np
from scipy.sparse import coo_matrix

# Row and column indices; the coordinate (3, 1) appears twice.
xs = np.array([1, 2, 3, 3, 2])
ys = np.array([2, 3, 1, 1, 1])
A = coo_matrix((np.ones((5,)), (xs, ys)))
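At this point A is a <4x4 sparse matrix of type '<type 'numpy.float64'>' with 5 stored elements in COOrdinate format>, although two of those elements share the same coordinate (3, 1). The COO format stores duplicate entries separately and only sums them when the matrix is converted. So A = A.sign() operates only on the 5 stored values, which are all 1 in the first place, and the duplicates still add up to 2 in the dense matrix.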
>>> A.data
array([ 1., 1., 1., 1., 1.])
>>> A.todense()
matrix([[ 0., 0., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 1.],
[ 0., 2., 0., 0.]])
>>> A = A.sign()
>>> A.todense()
matrix([[ 0., 0., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 1.],
[ 0., 2., 0., 0.]])
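One way to get the expected result, sketched here under the assumption that merging duplicate edges is acceptable for the graph, is to sum the duplicate entries before taking the sign. coo_matrix.sum_duplicates() does this in place, and converting with tocsr() also sums duplicates. The names S and dense are just for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Same example as above: the coordinate (3, 1) is stored twice.
xs = np.array([1, 2, 3, 3, 2])
ys = np.array([2, 3, 1, 1, 1])
A = coo_matrix((np.ones((5,)), (xs, ys)))

# Merge duplicate coordinates in place, so (3, 1) holds a single value of 2.
A.sum_duplicates()

# sign() now sees the merged value and maps it to 1.
S = A.sign()
dense = S.toarray()
print(dense)
```

After sum_duplicates() the matrix has one stored element per coordinate, so the sign is taken of the summed value rather than of each duplicate separately.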