I have a function that takes a csr_matrix and does some calculations on it.
These calculations require the matrix to have a specific shape (say NxM).
The input I send has fewer columns and the exact number of rows
(e.g. it has shape=(A,B) where A == N and B < M).
For example: I have the object x
>>> from scipy.sparse import csr_matrix
>>> x = csr_matrix([[1,2],[1,2]])
>>> print(x)
(0, 0) 1
(0, 1) 2
(1, 0) 1
(1, 1) 2
>>> x.shape
(2, 2)
And a function f:
def f(csr_mat):
    """csr_mat.shape should be (2,3)"""
Then I want to do something to x so that it becomes y:
>>> y = csr_matrix([[1,2,0],[1,2,0]])
>>> print(y)
(0, 0) 1
(0, 1) 2
(1, 0) 1
(1, 1) 2
>>> y.shape
(2, 3)
In this example, x and y have the same non-zero values, but y has a different shape. What I want is to efficiently 'extend' x to a new shape, filling the new columns with zeros. Namely, given x and new_shape=(2,3), it should return y.
I already tried reshape:
x.reshape((2,3))
But then I got:
NotImplementedError
My second option was to create a new csr_matrix with a different shape:
z = csr_matrix(x,shape=(3,3))
But this fails as well:
NotImplementedError: Reshaping not implemented for csr_matrix.
EDIT: using csc_matrix produces the same errors.
Any ideas?
Thanks
In the CSR format, the underlying data, indices, and indptr arrays for your desired y are identical to those of your x matrix. You can pass those to the csr_matrix constructor with a new shape:
y = csr_matrix((x.data, x.indices, x.indptr), shape=(2, 3))
Note that the constructor defaults to copy=False, so this will share the data, indices, and indptr arrays between x and y. Operations on y that modify those arrays in place will also be reflected in x. You can pass copy=True to make x and y independent of each other.
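To make the constructor approach concrete, here is a minimal sketch that wraps it in a small helper. The name pad_columns and its shape check are my own additions for illustration, not part of SciPy; the check is there because this trick only works when the row count stays the same (adding rows would need a longer indptr).

```python
import numpy as np
from scipy.sparse import csr_matrix


def pad_columns(mat, new_shape, copy=False):
    """Return a CSR matrix with the same stored entries as `mat` but more columns.

    `pad_columns` is a hypothetical helper name, not a SciPy function.
    """
    rows, cols = mat.shape
    new_rows, new_cols = new_shape
    # Only the column count may grow: extra rows would require a longer indptr,
    # and fewer columns could leave column indices out of range.
    if new_rows != rows or new_cols < cols:
        raise ValueError("new_shape must keep the row count and not drop columns")
    return csr_matrix((mat.data, mat.indices, mat.indptr), shape=new_shape, copy=copy)


x = csr_matrix([[1, 2], [1, 2]])
y = pad_columns(x, (2, 3))                   # may share buffers with x
y_copy = pad_columns(x, (2, 3), copy=True)   # independent of x

print(y.shape)
print(y.toarray())
# Inspect whether the underlying buffers are shared with x.
print(np.shares_memory(x.data, y.data))
print(np.shares_memory(x.data, y_copy.data))
```

Use copy=True whenever you plan to modify the result in place and still need the original x untouched.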
If you want to poke at the undocumented internals of csr_matrix, you can set the internal _shape attribute to make the x array have the shape you want:
x._shape = (2, 3)
There isn't really an advantage to doing it this way.
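If the goal is to change x itself rather than build a new matrix, a documented route is probably preferable to touching _shape. A minimal sketch, assuming a SciPy release that provides the in-place resize method on sparse matrices (I believe 1.1 or later):

```python
from scipy.sparse import csr_matrix

x = csr_matrix([[1, 2], [1, 2]])

# resize() changes x in place. Growing the column count only adds implicit
# zeros; shrinking would drop any stored entries outside the new shape.
x.resize((2, 3))

print(x.shape)
print(x.toarray())
```

Unlike the constructor approach, this leaves you with a single object, so there is no question of buffers being shared between two matrices.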